Preface 



The idea of writing a book on CMOS imaging has been brewing for 
several years. It was placed on a fast track after we agreed to organize a 
tutorial on CMOS sensors for the 2004 IEEE International Symposium on 
Circuits and Systems (ISCAS 2004). This tutorial defined the structure of the 
book, but as first-time authors/editors, we had a lot to learn about the 
logistics of putting together information from multiple sources. Needless to 
say, it was a long road between the tutorial and the book, and it took more 
than a few months to complete. We hope that you will find our journey 
worthwhile and the collated information useful. 

The laboratories of the authors are located at many universities 
distributed around the world. Their unifying theme, however, is the 
advancement of knowledge for the development of systems for CMOS 
imaging and image processing. We hope that this book will highlight the 
ideas that have been pioneered by the authors, while providing a roadmap 
for new practitioners in this field to exploit exciting opportunities to 
integrate imaging and “smartness” on a single VLSI chip. The potential of 
these smart imaging systems is still unfulfilled. Hence, there is still plenty of 
research and development to be done. 

We wish to thank our co-authors, students, administrative assistants, and 
laboratory co-workers for their excitement and enthusiasm for being 
involved in this project. Specifically, we would like to thank Alex Belenky, 
Rachel Mahluf-Zilberberg, and Ruslan Sergienko from the VLSI Systems 
Center at Ben-Gurion University. 

CMOS Imagers: From Phototransduction to Image Processing 

We also would like to thank our mentors, Eric Fossum, Jan van der 
Spiegel, Albert Theuwissen, Mohammed Ismail, Dan McGrath, Eby 
Friedman, Andreas Andreou, Norman Kopeika, Zamik Rosenwaks, Irvin 
Heard, and Paul Mueller for their support at different stages of this project. 

Furthermore, we would like to thank our copy-editor, Stan Backs of 
SynchroComm Inc. 

In addition, we would like to thank our publishers, Kluwer Academic 
Publishers, and especially Mark de Jongh for being patient with us all the way. 

Last but not least, we would like to thank our loved ones for their support 
during the process. We hope the missing hours with them are worth the 
result. 

Orly Yadid-Pecht and Ralph Etienne-Cummings 




Introduction 



This book starts with a detailed presentation of the basic concepts of 
phototransduction, modeling, evaluation, and optimization of Active Pixel 
Sensors (APS). It continues with the description of APS design issues using 
a bottom-up strategy, starting from pixels and finishing with image 
processing systems. Various focal-plane image processing alternatives either 
to improve imaging or to extract visual information are presented. The book 
closes with a discussion of a completely non-traditional method for image 
noise suppression that utilizes floating-gate learning techniques. The final 
three chapters in fact provide a glimpse into a potential future of CMOS 
imaging and image processing, where concepts gleaned from other 
disciplines, such as biological vision, are combined with alternative 
mixed-signal computation circuits to perform complex visual information 
processing and feature extraction at the focal plane. This benefit of CMOS 
imaging and image processing is still largely unexploited by the commercial 
sector. 

The first chapter reviews the background knowledge and concepts of 
silicon-based phototransduction, and introduces relevant concepts from 
semiconductor physics. Several silicon-based photodetectors are examined, 
including the photodiode and the photogate. This chapter also describes the 
operation of the charge-coupled device (CCD) imager, the predominant 
technology available for digital imaging. CCD technology is compared with 
a promising alternate technology, the APS imager. In addition, the functional 
performances of several basic pixel structures are compared by considering 
them as communication channels and determining their ability to convey 
information about an incident optical signal. At 30 frames per second, 
information rates are similar for charge-, voltage-, and current-mode pixels. 
Comparable trends are found for their information capacities as the 
photocurrent varies. 

The second chapter deals with the modulation transfer function (MTF) of 
an APS. MTF is one of the most significant factors determining the image 
quality. Unfortunately, characterization of the MTF of semiconductor-based 
focal-plane arrays (FPA) has typically been one of the more difficult and 
error-prone performance testing procedures. Based on a thorough analysis of 
experimental data, a unified model has been developed for estimation of the 
MTF of a general CMOS active pixel sensor for scalable CMOS 
technologies. The model covers the physical diffusion effect together with 
the influence of the geometrical shape of the pixel active area. Excellent 
agreement is reported between the results predicted by the model and the 
MTF calculated from the point spread function (PSF) measurements of an 
actual pixel. This fit confirms the hypothesis that the active area shape and 
the photocarrier diffusion effect are the determining factors of the overall 
MTF behavior of CMOS active pixel sensors, thus allowing the extraction of 
the minority-carrier diffusion length. 

The third chapter deals with photoresponse analysis and pixel shape 
optimization for CMOS APS. A semi-analytical model is developed for the 
estimation of the photoresponse of a photodiode-based CMOS APS. This 
model is based on a thorough analysis of experimental data, and incorporates 
the effects of substrate diffusion as well as geometrical shape and size of the 
photodiode active area. It describes the dependence of pixel response on 
the integrated photocarriers and on conversion gain. The model also 
demonstrates that the tradeoff between these two conflicting factors can lead 
to an optimal geometry, enabling the extraction of a maximal photoresponse. 
The dependence of the parameters on process and design data is discussed, 
and the degree of accuracy for the photoresponse modeling is assessed. 

The fourth chapter reviews APS design from the basics to more advanced 
system-on-chip examples. Since APS are fabricated in a commonly used 
CMOS process, image sensors with integrated “intelligence” can be 
designed. These sensors are very useful in many scientific, commercial and 
consumer applications. Current state-of-the-art CMOS imagers allow 
integration of all functions required for timing, exposure control, color 
processing, image enhancement, image compression, and analog-to-digital 
conversion (ADC) on the same die. In addition, CMOS imagers offer 
significant advantages and rival traditional charge-coupled devices in terms 
of low power, low voltage and monolithic integration. The chapter presents 
different types of CMOS pixels and introduces the system-on-chip approach, 
showing examples of two “smart” APS imagers: a smart vision system-on- 
chip and a smart tracking sensor. The former is based on a photodiode APS 
with linear output over a wide dynamic range, made possible by random 
access to each pixel in the array and by the insertion of additional circuitry 
into the pixels. The latter is a smart tracking sensor employing analog 
non-linear winner-take-all (WTA) selection. 

The fifth chapter discusses three systems for imaging and visual 
information processing at the focal plane, using three different 
representations of the incident photon flux density: current-mode, voltage- 
mode, and mixed-mode image processing. This chapter outlines how 
spatiotemporal image processing can be implemented in current and voltage 
modes. A computation-on-readout (COR) scheme is highlighted. This 
scheme maximizes pixel density and allows multiple processed images to be 
produced in parallel. COR requires little additional area and access time 
compared to a simple imager, and the ratio of imager to processor area 
increases drastically with scaling to CMOS technologies with smaller feature 
size. In some cases, it is necessary to perform computations in a pixel- 
parallel manner while still retaining the imaging density and low-noise 
properties of an APS imager. Hence, an imager that utilizes both current- 
mode and voltage-mode imaging and processing is presented. However, this 
mixed-mode approach has some limitations, and these are described in 
detail. Three case studies show the relative merits of the different approaches 
for focal-plane analog image processing. 

The last chapter investigates stochastic adaptive algorithms for on-line 
correction of spatial non-uniformity in random-access addressable imaging 
systems. An adaptive architecture is implemented in analog VLSI, and is 
integrated with the photosensors on the focal plane. Random sequences of 
address locations selected with controlled statistics are used to adaptively 
equalize the intensity distribution at variable spatial scales. Through a 
logarithmic transformation of system variables, adaptive gain correction is 
achieved based on offset correction in the logarithmic domain. This idea is 
particularly attractive for compact implementation using translinear floating- 
gate MOS circuits. Furthermore, the same architecture and random 
addressing provide for oversampled binary encoding of the image resulting 
in an equalized intensity histogram. The techniques can be applied to a 
variety of solid-state imagers, such as artificial retinas, active pixel sensors, 
and infrared sensor arrays. Experimental results confirm gain correction and 
histogram equalization in a 64 x 64 pixel adaptive array. 

We hope this book will be interesting and useful for established 
designers, who may benefit from the embedded case studies. In addition, the 
book might help newcomers to appreciate both the general concepts and the 
design details of smart CMOS imaging arrays. Our focus is on the practical 
issues encountered in designing these systems, which will always be useful 
for both experienced and novice designers. 




Chapter 1 

FUNDAMENTALS OF SILICON-BASED 
PHOTOTRANSDUCTION 



Honghao Ji and Pamela A. Abshire 

Department of Electrical and Computer Engineering / Institute for Systems Research 
University of Maryland 
College Park, MD 20742, USA 



Abstract: This chapter reviews background knowledge and concepts of silicon-based 
phototransduction. Relevant concepts from semiconductor physics, imaging 
technology, and information theory are introduced. Several silicon-based 
photodetectors are examined, including the photodiode and the photogate. This 
chapter also describes the operation of the charge-coupled device (CCD) 
imager, the predominant technology available for digital imaging. CCD 
technology is compared with a promising alternate technology, the active pixel 
sensor (APS) imager. In addition, several basic pixel structures are compared 
in terms of their functional performance by considering them as 
communication channels and determining their ability to convey information 
about an incident optical signal. At 30 frames per second, information rates are 
similar for charge-, voltage-, and current-mode pixels as the photocurrent 
varies. Comparable trends are found for their information capacities as the 
photocurrent varies under idealized operating conditions. 

Key words: Photodetector, photoconductor, photodiode, phototransistor, photogate, 
quantum efficiency, noise, charge-coupled device (CCD), CMOS image 
sensor, active pixel sensor (APS), information rate, capacity. 

1.1 Introduction 

Modern photography is a versatile and commercially important 
technology with numerous applications, including cinematography, 
spectrography, astronomy, radiography, and photogrammetry. This chemical 
technology transduces light into a physical representation through a 
sequence of chemical reactions of silver halide films, including exposure to 
light, development using benzene derivatives, and fixation using sodium 
thiosulphate. Generating copies of the original image requires essentially the 
same procedure. In contrast with conventional film photography, electronic 
imaging represents light electronically by directly transducing optical signals 
into electronic signals using image sensors. Such electronic representations 
enable many applications because processing, storing, and transmitting 
electronic images are all much easier and more readily automated than 
comparable manipulations of the physical material representation generated 
by conventional film photography. 

This chapter reviews fundamental aspects of phototransduction using 
semiconductor image sensors. Concepts of solid-state physics useful for 
understanding the operation of photodetectors are discussed, and several 
common silicon-based photodetectors are described. The two predominant 
image sensor technologies are introduced: charge-coupled device (CCD) 
technology and active pixel sensor (APS) technology. These are compared 
and contrasted with each other to understand the advantages and 
opportunities of each technology. Three different CMOS pixel structures are 
then compared as communication channels by determining the ability of 
each one to convey information about optical signals. 

1.2 Background physics of light sensing 

While a detailed exposition of the interaction between light and matter is 
beyond the scope of this chapter, a solid understanding of semiconductor 
physics and the interaction between semiconductors and light will provide 
insight into the operation of semiconductor imagers. For further details, 
excellent references on this subject are available [1-4]. 

1.2.1 Energy band structure 

Understanding how optical information is transduced into electronic 
information requires understanding the electronic properties of 
semiconductors, starting with the behavior of electronic carriers in 
semiconductors. From quantum mechanics, it is known that electrons bound 
to the nucleus of an isolated atom can only have discrete energy levels 
separated by forbidden gaps where no energy level is allowed [5]. Only one 
electron can occupy each quantum state, so identical energy levels for two 
isolated atoms are split into two similar but distinct energy levels as those 
isolated atoms are brought together. When many atoms come together to 
form a crystal, the previously discrete energy levels of the isolated atoms are 
spread into continuous bands of energy levels that are separated by gaps 
where no energy level is allowed. The energy band structure of electrons in a 
crystalline solid describes the relationship between the energy and 
momentum of allowed states, and is determined by solving the Schrödinger 
equation: 

\[ \left[ -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r}) \right] \psi_k(\mathbf{r}) = E_k\,\psi_k(\mathbf{r}) \qquad (1.1) \]

where $\hbar$ is the reduced Planck constant ($\hbar = h/2\pi$), $m$ is the mass 
of the particle, $V(\mathbf{r})$ is the potential energy for an electron, $E_k$ 
is the total energy, and $\psi_k(\mathbf{r})$ is the wave function. 

According to the Bloch theorem, if the potential energy $V(\mathbf{r})$ is 
periodic, then the wave function $\psi_k(\mathbf{r})$ has the form 

\[ \psi_k(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\, u_k(\mathbf{r}) \qquad (1.2) \]

where $u_k(\mathbf{r})$ is periodic in $\mathbf{r}$ and is known as a Bloch 
function. Thus the wave function of an electron in a crystal is the product 
of a periodic lattice function and a plane wave. Periodicity greatly reduces 
the complexity of solving the Schrödinger equation. The wavevector 
$\mathbf{k}$ labeling the wave function serves the same role as the 
wavevector in the wave function for a free-space electron, and 
$\hbar\mathbf{k}$ is known as the crystal momentum [2] of an electron. For 
some ranges of momentum, the electron velocity is a linear function of its 
momentum, so the electron in the lattice can be considered as a classical 
particle; Newton’s second law of motion and the law of conservation of 
momentum determine the trajectory of the electron in response to an 
external force or a collision. 

For a one-dimensional lattice with a period of $a$, the region 
$-\pi/a < k < \pi/a$ in momentum space is called the first Brillouin zone. The 
energy band structure is periodic in $k$ with period $2\pi/a$, so it is 
completely determined by the first unit cell, or the first Brillouin zone in 
$k$ space. In three dimensions, the first Brillouin zone has a complicated 
shape depending on the specific crystalline structure. Figure 1-1 shows the 
Brillouin zone along with several of the most important symmetry points for 
a face-centered cubic (fcc) lattice, the crystalline structure of 
semiconductors such as Si, Ge, and GaAs. Figure 1-2 shows the calculated 
energy band structures of Si, Ge, and GaAs from the center to the boundary 
of the first Brillouin zone along two important axes of symmetry, <111> and 
<100>. 

Figure 1-1. The Brillouin zone and several important symmetry points for face-centered cubic 
lattices such as diamond and zinc blende, the crystalline structures of Si, Ge, and GaAs. 

Figure 1-2. Energy band structure along the <100> and <111> axes for Ge, Si, and GaAs in 
the first Brillouin zone. (Adapted with permission from S. M. Sze, Physics of Semiconductor 
Devices, New York: Wiley, 1981.) 

A prominent feature of Figure 1-2 is the region in which no energy states 
exist. Such forbidden energy regions exist for all crystalline solids. At a 
temperature of absolute zero, the lowest energy band that is not fully 
occupied and all higher energy bands are called conduction bands; all lower 
bands are full of electrons and called valence bands. Whereas the lowest 
conduction band is partially occupied for conductors, all conduction bands 
are empty for insulators and semiconductors at absolute zero. The separation 
between the minimum conduction band and maximum valence band 
energies is called the bandgap energy, $E_g$. The main difference between 
insulators and semiconductors is that the value of the bandgap energy $E_g$ is 
much larger for insulators than for semiconductors. A semiconductor with its 
valence band maximum and conduction band minimum at the same 
wavevector $k$ is known as a direct bandgap material (examples include 
GaAs and InGaAs). A semiconductor with its valence band maximum and 
conduction band minimum at different wavevectors $k$ is known as an 
indirect bandgap material (examples include Si and Ge). 

Electrons in a crystalline material experience external electric fields as 
well as internal fields generated by other electrons and atoms. When an 
electric field $E_x$ is applied to a semiconductor, electrons experience a force 
equal to $|qE_x|$ in a direction opposite to $E_x$, where $q$ is the unit charge. 
Whereas an electron in vacuum experiences a constant acceleration and a 
ballistic trajectory in response to an external force, an electron in a solid also 
experiences viscous drag forces due to collisions with the lattice. The motion 
of electrons having energy near the band edges can be described using the 
effective mass $m^*$. They attain an average drift velocity 
$|v_x| = |q\tau E_x/m^*|$, where $\tau$ is the average time between collisions. 
The value of the effective mass is given by 

\[ m^* = \hbar^2 \left( \frac{d^2E}{dk^2} \right)^{-1} \qquad (1.3) \]

where $E$ is electron energy and $k$ is the wavevector. The ratio $q\tau/m^*$ is 
known as the mobility $\mu$. The average velocity of carriers in a semiconductor 
due to an applied electric field $E$ is the product of the mobility and the 
electric field: 

\[ v = \mu E \qquad (1.4) \]
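The chain from band curvature to drift velocity can be checked numerically. The sketch below is only an illustration: it builds a parabolic model band from an assumed effective mass (0.26 free-electron masses), recovers that mass from the curvature as in equation (1.3), and then applies the mobility and drift-velocity relations with an assumed collision time and field. None of the numerical inputs come from the text.

```python
import numpy as np

HBAR = 1.054571817e-34    # reduced Planck constant, J*s
Q = 1.602176634e-19       # elementary charge, C
M_E = 9.1093837015e-31    # free-electron mass, kg


def effective_mass(energy, k):
    """Effective mass at the band edge from eq. (1.3): m* = hbar^2 / (d2E/dk2)."""
    curvature = np.gradient(np.gradient(energy, k), k)
    return HBAR**2 / curvature[len(k) // 2]   # middle sample is k = 0


def mobility(tau, m_star):
    """Mobility mu = q * tau / m*, in m^2/(V*s)."""
    return Q * tau / m_star


def drift_velocity(mu, field):
    """Average drift velocity from eq. (1.4), v = mu * E."""
    return mu * field


# Illustrative inputs (assumptions, not values from the text).
M_ASSUMED = 0.26 * M_E     # effective mass used to build the model band
TAU = 2e-13                # mean time between collisions, s
FIELD = 1e4                # applied field, V/m (100 V/cm)

k = np.linspace(-1e9, 1e9, 2001)               # wavevector samples, 1/m
energy = HBAR**2 * k**2 / (2 * M_ASSUMED)      # parabolic model band, J

m_star = effective_mass(energy, k)
mu = mobility(TAU, m_star)
v = drift_velocity(mu, FIELD)
print(f"m*/m_e = {m_star / M_E:.3f}")
print(f"mobility = {mu * 1e4:.0f} cm^2/(V*s)")
print(f"drift velocity = {v:.0f} m/s")
```

A real band is parabolic only near its extremum, so the curvature should be evaluated close to the band edge, as done here at k = 0.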



1.2.2 Carriers in semiconductors 

Whereas electrons are the only charge carriers in conductors, both 
electrons and holes serve as mobile charges in semiconductors. At finite 
temperature, electrons in the valence band can acquire enough energy to 
jump into the conduction band, leaving empty states behind in the valence 
band. When nearly all energy levels in the valence band are occupied by 
electrons, the empty states are regarded as occupied by positive charges 
known as holes. The valence states occupied by electrons are considered to 
be empty of holes. 

A perfect semiconductor crystal without any impurities or defects is 
called an intrinsic semiconductor. In such material, thermal excitation 
generates electron-hole pairs by providing the energy required for an 
electron to leave a valence state and enter a conduction state. At thermal 
equilibrium, the electron and hole concentrations in an intrinsic 
semiconductor depend on temperature as follows: 

\[ n = N_c\, e^{-(E_c - E_F)/kT} \qquad (1.5) \]

\[ p = N_v\, e^{-(E_F - E_v)/kT} \qquad (1.6) \]

where $k$ is Boltzmann’s constant, $T$ is the absolute temperature, $E_F$ is the 
Fermi energy, $E_c$ is the minimum energy level of the conduction band, $E_v$ is 
the maximum energy level of the valence band, and $N_c$ and $N_v$ are the 
effective densities of states in the conduction band and valence band, 
respectively. The Fermi energy is defined as the energy level at which 
electron and hole occupancies are equal; i.e., a state of that energy level is 
equally likely to be filled or empty. The effective densities of states in 
conduction band and valence band are defined as 



\[ N_{c,v} = 2\left( \frac{2\pi m^* kT}{h^2} \right)^{3/2} \qquad (1.7) \]

where $h$ is Planck’s constant, and $m^*$ is the density-of-states effective mass 
(i.e., $m_n^*$ is the effective mass of electrons in the conduction band for 
$N_c$, and $m_p^*$ is that of holes in the valence band for $N_v$). Contours of 
constant energy in the conduction band of silicon are ellipsoidal rather than 
spherical, so effective mass is not isotropic. Therefore, to find the effective 
density of states in the conduction band ($N_c$), the density-of-states effective 
mass $m_n^*$ must first be obtained by averaging the effective masses along the 
appropriate directions: 

\[ m_n^* = \left( m_l^*\, m_{t1}^*\, m_{t2}^* \right)^{1/3} \qquad (1.8) \]

where $m_l^*$, $m_{t1}^*$, and $m_{t2}^*$ are the effective masses along the 
longitudinal and transverse axes of the ellipsoids. As shown in Figure 1-2, the 
two highest valence bands near the center of the first Brillouin zone are 
approximately parabolic. The band with larger curvature is known as the light 
hole band due to its smaller effective mass, whereas the other is known as the 
heavy hole band. The density-of-states effective mass of the valence band 
averages over both bands according to 

\[ m_p^* = \left( {m_{lh}^*}^{3/2} + {m_{hh}^*}^{3/2} \right)^{2/3} \qquad (1.9) \]

where $m_{lh}^*$ and $m_{hh}^*$ are the effective masses of the light holes and 
heavy holes respectively. 
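As a rough numerical illustration, the sketch below evaluates equation (1.7) and the two mass averages. It assumes the usual geometric-mean form for the electron density-of-states mass and the 3/2-power sum for holes, treats a single conduction-band ellipsoid (no factor for multiple equivalent minima), and uses relative masses that are illustrative assumptions rather than values given in the text.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
H = 6.62607015e-34        # Planck constant, J*s
M_E = 9.1093837015e-31    # free-electron mass, kg


def dos_mass_electrons(m_l, m_t1, m_t2):
    """Geometric mean of the longitudinal and two transverse masses."""
    return (m_l * m_t1 * m_t2) ** (1.0 / 3.0)


def dos_mass_holes(m_lh, m_hh):
    """Combine light- and heavy-hole masses through their 3/2 powers."""
    return (m_lh ** 1.5 + m_hh ** 1.5) ** (2.0 / 3.0)


def effective_density_of_states(m_rel, temperature):
    """Eq. (1.7) in cm^-3; m_rel is the mass in units of the free-electron mass."""
    per_m3 = 2.0 * (2.0 * math.pi * m_rel * M_E * K_B * temperature / H**2) ** 1.5
    return per_m3 * 1e-6   # convert m^-3 to cm^-3


# Illustrative relative masses (assumptions, not values from the text).
m_n = dos_mass_electrons(0.98, 0.19, 0.19)
m_p = dos_mass_holes(0.16, 0.49)
print(f"m_n*/m_e = {m_n:.3f}, m_p*/m_e = {m_p:.3f}")
print(f"N_c = {effective_density_of_states(m_n, 300.0):.2e} cm^-3")
print(f"N_v = {effective_density_of_states(m_p, 300.0):.2e} cm^-3")
```

Because the density of states scales as the 3/2 power of the mass, modest changes in the assumed masses shift the result noticeably.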

Since the numbers of electrons and holes are equal in an intrinsic 
semiconductor, 



\[ n = p = n_i \qquad (1.10) \]

where $n_i$ is the intrinsic carrier concentration. From equations (1.5), (1.6), 
and (1.10), 

\[ n_i = \sqrt{np} = \sqrt{N_c N_v}\; e^{-E_g/2kT} \qquad (1.11) \]

where $E_g$ is the bandgap energy. At a temperature of 300 K, the intrinsic 
carrier concentration of Si is $1.45 \times 10^{10}$ cm$^{-3}$ [1]. 

The carrier concentrations in an intrinsic semiconductor can be 
dramatically altered by introducing special impurities known as dopants. 
Introducing dopants known as donors increases the mobile electron 
concentration, whereas introducing acceptors increases the mobile hole 
concentration. In doped semiconductors, the concentrations of electrons and 
holes are no longer equal, and the material is called an extrinsic semiconductor. 
The mass-action law [6] $n_i^2 = np$ still holds: if doping increases the 
concentration of electrons, the concentration of holes decreases (and vice 
versa). The predominant carrier is known as the majority carrier and the 
other is known as the minority carrier. A doped semiconductor is either N 
type or P type, with the majority carriers being either electrons or holes, 
respectively. 
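The mass-action law fixes the minority carrier concentration once the majority concentration is known. The sketch below is a minimal illustration that combines it with charge neutrality under the assumption of fully ionized dopants (an assumption added here, not stated in the text). The intrinsic concentration is the 300 K silicon value quoted above; the donor density in the usage lines is hypothetical.

```python
import math

N_I_SI = 1.45e10   # intrinsic carrier concentration of Si at 300 K, cm^-3 (from the text)


def carrier_concentrations(n_i, donors=0.0, acceptors=0.0):
    """Return (n, p) in cm^-3 from n * p = n_i^2 and n + N_A = p + N_D.

    Assumes every dopant atom is ionized.
    """
    net = donors - acceptors
    half = abs(net) / 2.0
    majority = half + math.sqrt(half**2 + n_i**2)
    minority = n_i**2 / majority          # mass-action law
    if net >= 0:
        return majority, minority         # N type: electrons are the majority
    return minority, majority             # P type: holes are the majority


# Hypothetical N-type sample with 1e16 donors per cm^3.
n, p = carrier_concentrations(N_I_SI, donors=1e16)
print(f"n = {n:.3e} cm^-3, p = {p:.3e} cm^-3")
```

With no doping the function returns n = p = n_i, the intrinsic case.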
1.2.3 Optical generation of electron-hole pairs 

Photons may be absorbed as they travel through semiconductor material. 
Sometimes the absorbed energy excites the transition of electrons from the 
valence band to the conduction band, leaving mobile holes behind. 
Photoexcited transitions are classified as interband, intraband, or trap-to- 
band transitions depending on the initial and final energy levels; these are 
depicted in Figure 1-3 by the transitions labeled (a), (b), and (c), 
respectively. 

Interband transitions are further classified as either direct or indirect 
transitions depending on whether an auxiliary phonon is involved. When an 
electron absorbs a photon of frequency $\omega/2\pi$, its energy increases by $\hbar\omega$. 
During the absorption, both the total energy and the total momentum of the 
system must be conserved. The momentum of an incident photon ($\hbar k$) is 
usually negligible compared to that of the electron. In the absence of 
additional momentum transfer, an electron transition induced by an incident 
photon may only occur between energy states with the same wavevector k . 
This is known as a direct transition (see Figure 1-3). 

In silicon, the energy minimum in the conduction band is located at 
roughly three-quarters of the distance from the Brillouin zone center along the 
<100> axis, whereas the energy maximum in the valence band is near the 
Brillouin zone center. As a result, a photoexcited electron transition requires 
an additional source of momentum in order to satisfy the conservation of 
energy and momentum. If the energy of the absorbed photon is the bandgap 
energy $E_g$, the momentum required is $3\pi\hbar/4a$, where $a$ is the lattice 
constant of silicon. This momentum is provided by a quantized lattice vibration called 
a phonon. A transition involving a phonon is known as an indirect transition 
(see Figure 1-3). 
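A quick estimate shows why the photon cannot supply this momentum itself. The sketch below compares the photon wavenumber with the wavevector offset 3π/4a quoted in this section. The lattice constant (0.543 nm) and the near-bandgap wavelength (1.1 µm) are assumed illustrative values, not figures from the text.

```python
import math


def photon_wavenumber(wavelength_m):
    """Free-space photon wavenumber k = 2*pi / lambda, in 1/m."""
    return 2.0 * math.pi / wavelength_m


def required_wavenumber(lattice_constant_m):
    """Wavevector offset 3*pi / (4a) quoted in the text, in 1/m."""
    return 3.0 * math.pi / (4.0 * lattice_constant_m)


A_SI = 0.543e-9       # assumed silicon lattice constant, m
WAVELENGTH = 1.1e-6   # assumed photon wavelength near the bandgap, m

ratio = required_wavenumber(A_SI) / photon_wavenumber(WAVELENGTH)
print(f"required momentum / photon momentum = {ratio:.0f}")
```

Under these assumptions the required momentum exceeds the photon momentum by a factor of several hundred, which is why a phonon must take part.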

1.3 Silicon-based photodetectors 

Photodetectors that transduce optical signals into electronic signals are 
increasingly important for applications in optical communication and digital 
photography. Operation of a photodetector comprises: (a) generation of free 
electron-hole pairs due to impinging light; (b) separation and collection of 
electrons and holes, possibly with current gain; and (c) production of an 
output signal through interaction with other components. The main figures 
of merit for a photodetector are sensitivity to light at the wavelength of 
interest, response speed, and device noise. In the following sections, several 
popular silicon-based photosensing devices are discussed: photoconductors, 
PN and PIN photodiodes, phototransistors, and photogates. Before 
examining specific devices in detail, several useful concepts will be briefly 
introduced. 

Figure 1-3. Schematic diagram of photoexcitation processes. (a) Interband transitions, 
including a direct transition (with no phonon) and an indirect transition (with a phonon 
involved). (b) Intraband transition. (c) Trap-to-band transition. 

When light impinges on a semiconductor, some fraction of the original 
optical power is reflected and the rest passes into the material. Inside the 
solid, interactions between photons and electrons cause a loss of optical 
power. For a uniform semiconductor material with an absorption coefficient 
$\alpha$, the light power at depth $x$ satisfies the relationship 

\[ P_{ph}(x + dx) - P_{ph}(x) = -\alpha P_{ph}(x)\,dx \qquad (1.12) \]

Therefore, the optical power traveling through the semiconductor decays 
exponentially: 

\[ P_{ph}(x) = P_{ph}(0)\, e^{-\alpha x} \qquad (1.13) \]

where the surface is taken to be at $x = 0$ and $P_{ph}(0)$ is the optical power at 
the surface. If the length of the detector along the incident light direction is 
$L$, the number of photons absorbed in the detector per unit time is 

\[ \frac{P_{ph}(0)\left( 1 - e^{-\alpha L} \right)}{\hbar\omega} \qquad (1.14) \]
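The exponential decay law translates directly into the fraction of light a detector of a given depth can capture. The following sketch is illustrative only: the absorption coefficient, depth, power, and wavelength are hypothetical, and the photon energy is computed as hc/λ.

```python
import math

H = 6.62607015e-34    # Planck constant, J*s
C = 2.99792458e8      # speed of light, m/s


def transmitted_fraction(alpha, depth):
    """Fraction of the surface power remaining at a given depth, eq. (1.13).

    alpha and depth must use reciprocal units (e.g. 1/cm and cm).
    """
    return math.exp(-alpha * depth)


def absorbed_fraction(alpha, length):
    """Fraction of the surface power absorbed within a detector of this length."""
    return 1.0 - math.exp(-alpha * length)


def absorbed_photon_rate(power_w, alpha, length, wavelength_m):
    """Photons absorbed per second: absorbed power divided by the photon energy."""
    photon_energy = H * C / wavelength_m
    return power_w * absorbed_fraction(alpha, length) / photon_energy


# Hypothetical case: alpha = 1e4 1/cm, 1 um (1e-4 cm) deep, 1 uW at 600 nm.
print(f"absorbed fraction = {absorbed_fraction(1e4, 1e-4):.3f}")
print(f"photon rate = {absorbed_photon_rate(1e-6, 1e4, 1e-4, 600e-9):.3e} 1/s")
```

When the product of absorption coefficient and length equals one, about 63% of the entering light is absorbed; a few absorption lengths capture nearly all of it.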
Figure 1-4 illustrates absorption coefficient as a function of wavelength 
for several semiconductor materials. As wavelength decreases, the 
absorption coefficient increases steeply for direct bandgap material like 
GaAs, whereas absorption coefficients increase more gradually for indirect 
bandgap material such as Si and Ge. Photons with energy less than the 
bandgap energy $E_g$ cannot be detected through band-to-band electron 
transitions, so the maximum (or cutoff) wavelength occurs at 

\[ \lambda_c = \frac{hc}{E_g} \qquad (1.15) \]

for photons with energy equal to $E_g$, where $c$ is the speed of light. For 
wavelengths near $\lambda_c$, a phonon is required to complete an interband 
transition in an indirect bandgap material. Hence the probability of absorption 
decreases significantly for wavelengths near $\lambda_c$, as shown in Figure 1-4 
for Si and Ge. 
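The cutoff wavelength follows directly from the bandgap. The sketch below evaluates hc/E_g for three materials; the room-temperature bandgap values are commonly quoted approximations used here as assumptions, not numbers taken from the text.

```python
H_EV = 4.135667696e-15   # Planck constant, eV*s
C = 2.99792458e8         # speed of light, m/s


def cutoff_wavelength_nm(bandgap_ev):
    """Cutoff wavelength lambda_c = h*c / E_g, returned in nanometres."""
    return H_EV * C / bandgap_ev * 1e9


# Approximate room-temperature bandgaps in eV (assumed values).
BANDGAPS_EV = {"Si": 1.12, "Ge": 0.66, "GaAs": 1.42}

for material, e_g in BANDGAPS_EV.items():
    print(f"{material}: lambda_c = {cutoff_wavelength_nm(e_g):.0f} nm")
```

A smaller bandgap pushes the cutoff further into the infrared, which is why Ge responds at wavelengths where Si does not.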

If there is no applied or built-in electric field to separate the 
photogenerated electron-hole pairs, they will recombine and emit either light 
or heat. To detect the optical signal, the photogenerated free carriers must be 
collected. To detect the signal efficiently, the free carriers must be prevented 
from recombining. The responsivity ($R_{ph}$) of a photodetector is defined as the 
ratio of induced current density to optical power density: 

\[ R_{ph} = \frac{J_{ph}}{P_{ph}} \qquad (1.16) \]

where $J_{ph}$ is the light-induced current density and $P_{ph}$ is the optical power 
per unit area of the incident light. The quantum efficiency $\eta$ is defined as the 
number of photogenerated carriers per incident photon: 

\[ \eta = \frac{J_{ph}/q}{P_{ph}/\hbar\omega} = \frac{\hbar\omega}{q}\, R_{ph} \qquad (1.17) \]
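Treating quantum efficiency as collected carriers per incident photon, that is, (J_ph/q) divided by (P_ph/ħω), responsivity and quantum efficiency convert into each other through the photon energy. The sketch below shows that conversion with the photon energy written as hc/λ; the wavelength and efficiency in the usage lines are hypothetical.

```python
H = 6.62607015e-34    # Planck constant, J*s
C = 2.99792458e8      # speed of light, m/s
Q = 1.602176634e-19   # elementary charge, C


def responsivity(eta, wavelength_m):
    """Responsivity in A/W for a given quantum efficiency and wavelength."""
    return eta * Q * wavelength_m / (H * C)


def quantum_efficiency(resp_a_per_w, wavelength_m):
    """Quantum efficiency implied by a measured responsivity."""
    return resp_a_per_w * H * C / (Q * wavelength_m)


# Hypothetical detector: 50% quantum efficiency at 550 nm.
r = responsivity(0.5, 550e-9)
print(f"R = {r:.3f} A/W")
print(f"eta recovered = {quantum_efficiency(r, 550e-9):.2f}")
```

For a fixed quantum efficiency the responsivity grows linearly with wavelength, because each longer-wavelength photon carries less energy but still yields at most one carrier.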




Figure 1-4. Absorption coefficient as a function of wavelength for several semiconductor 
materials. Below the cutoff wavelength, absorption increases steeply for direct bandgap 
materials such as GaAs, whereas absorption increases more gradually for indirect bandgap 
materials such as Si and Ge. (Adapted with permission from J. Wilson and J. F. B. Hawkes, 
Optoelectronics: an Introduction, Englewood Cliffs, New Jersey: Prentice Hall, 1983.) 

Noise levels determine the smallest optical power that can be detected. 
The noise sources of a photodetector include shot noise from signal and 
background currents, flicker noise (also known as $1/f$ noise), and thermal 
noise from thermal agitation of charge carriers. Noise is often characterized 
by the noise equivalent power (NEP), defined as the optical power required 
to provide a signal-to-noise ratio (SNR) of one. Although some authors 
define the NEP for a 1-Hz bandwidth, the NEP is generally a nonlinear 
function of the bandwidth. The reciprocal of the NEP, known as the 
detectivity $D$, provides an alternative measure; a larger detectivity correlates 
with improved detector performance. Noise power usually scales with 
detector area $A$ and bandwidth $B$. To normalize for these factors, the specific 
detectivity $D^*$ is defined as 

\[ D^* = \frac{\sqrt{AB}}{NEP} \qquad (1.18) \]
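A short calculation makes the normalization concrete. The sketch below evaluates the detectivity and the specific detectivity for a hypothetical detector; the area, bandwidth, and NEP are made-up illustrative numbers.

```python
import math


def detectivity(nep_w):
    """Detectivity D = 1 / NEP, in 1/W."""
    return 1.0 / nep_w


def specific_detectivity(area_cm2, bandwidth_hz, nep_w):
    """Specific detectivity D* = sqrt(A * B) / NEP, in cm*Hz^0.5/W."""
    return math.sqrt(area_cm2 * bandwidth_hz) / nep_w


# Hypothetical detector: 100 um x 100 um (1e-4 cm^2), 1 Hz bandwidth, NEP of 1 pW.
d_star = specific_detectivity(1e-4, 1.0, 1e-12)
print(f"D  = {detectivity(1e-12):.1e} 1/W")
print(f"D* = {d_star:.1e} cm*Hz^0.5/W")
```

Because D* folds in area and bandwidth, it allows detectors of different sizes and measurement bandwidths to be compared on a common footing.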



1.3.1 Photoconductor 

The photoconductor is the simplest semiconductor light detector: it 
consists of a piece of semiconductor with ohmic contacts at both ends. 
Photons are absorbed as incident light passes through the material, thus 
generating electrons and holes that increase the conductivity of the 
semiconductor. Consequently, current flowing through the photoconductor 
in response to an applied external voltage is a function of the incident optical 
power density $P_{ph}$. Figure 1-5 depicts the structure of a photoconductor and 
a bias circuit for measurement. 

Figure 1-5. (a) Photoconductor structure. (b) Typical photoconductor bias circuit. 

Carriers in a semiconductor recombine at the rate $n(t)/\tau$, where $n(t)$ is the 
carrier concentration and $\tau$ is the lifetime of carriers. The excess carrier 
concentration is the additional carrier concentration relative to the thermal 
equilibrium concentration, i.e., the “extra” carriers beyond those expected 
from thermal generation. The photogenerated excess carrier concentration 
decays exponentially in time after an impulse as $n(t) = n_0 \exp(-t/\tau)$, where 
the excess carrier concentration at time zero is $n_0$. For monochromatic light 
of constant power impinging uniformly on the surface of a photoconductor, 
the generation rate of electron-hole pairs per unit volume is proportional to 
the optical power density, whereas the recombination rate depends on the 
density of photogenerated carriers. 

The population of excess carriers increases until the photoconductor 
reaches equilibrium, i.e., the generation rate balances the recombination rate, 
so that 

$$R = \frac{\Delta p}{\tau} = \frac{\Delta n}{\tau} = G = \frac{\eta\,\mathcal{P}_{ph} W L}{\hbar\omega\, W L H} \tag{1.19}$$

where $\Delta p$ and $\Delta n$ are the excess hole and electron densities generated by 
photon absorption, $\mathcal{P}_{ph}$ is the incident optical power density, and $L$, $W$, and $H$ 
are the length, width and height of the photoconductor. These excess carriers 
increase the conductivity by $\Delta\sigma$, which is 

$$\Delta\sigma = q\mu_n \Delta n + q\mu_p \Delta p \tag{1.20}$$



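where $\mu_n$ is the electron mobility and $\mu_p$ is the hole mobility. For an applied 
voltage of $V_{bias}$, the photogenerated current density $\Delta J$ is 

$$\Delta J = \Delta\sigma\,\frac{V_{bias}}{L} = \left(\frac{q\mu_n \eta\,\mathcal{P}_{ph} W L \tau}{\hbar\omega\, W L H} + \frac{q\mu_p \eta\,\mathcal{P}_{ph} W L \tau}{\hbar\omega\, W L H}\right)\frac{V_{bias}}{L} = \frac{q\eta\,\mathcal{P}_{ph}\,\mu_n}{\hbar\omega H}\left(1 + \frac{\mu_p}{\mu_n}\right)\frac{V_{bias}\,\tau}{L} \tag{1.21}$$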



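The light-induced current $\Delta I$ is the current density times the cross-sectional 
area: 

$$\Delta I = \Delta J\,WH = \frac{q\eta\,\mathcal{P}_{ph} W \mu_n}{\hbar\omega}\left(1 + \frac{\mu_p}{\mu_n}\right)\frac{V_{bias}\,\tau}{L} \tag{1.22}$$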



If the primary photocurrent is defined as $I_{ph} = q\eta P_{ph}/\hbar\omega$, where 
$P_{ph} = \mathcal{P}_{ph} W L$ is the total optical power falling on the detector, then the light- 
induced current $\Delta I$ scales with the primary photocurrent, the carrier lifetime, 
and bias voltage, and decreases with increasing device length, giving 

$$\Delta I = I_{ph}\left(1 + \frac{\mu_p}{\mu_n}\right)\frac{V_{bias}\,\mu_n \tau}{L^2} = I_{ph}\left(1 + \frac{\mu_p}{\mu_n}\right)\frac{\mu_n E \tau}{L} = I_{ph}\left(1 + \frac{\mu_p}{\mu_n}\right)\frac{v_n \tau}{L} = I_{ph}\left(1 + \frac{\mu_p}{\mu_n}\right)\frac{\tau}{t_r} \tag{1.23}$$

where $E = V_{bias}/L$ is the electric field, $v_n = \mu_n E$ is the electron drift velocity, 
and $t_r = L/v_n$ is the transit time of carriers, i.e., the average time required for a 
carrier to traverse the length of the device. The gain of a photoconductor is 
defined as the ratio of the light-induced current increment $\Delta I$ to the primary 
photocurrent $I_{ph}$. The gain $G$ depends on the lifetime of carriers relative to 
their transit time. For a silicon photoconductor, the gain can exceed 1000, 
which means that more than one thousand carriers cross the terminals of the 
photoconductor for each photogenerated electron-hole pair. This simple 
detector achieves the highest gain of any photodetector, up to $10^6$. The 
response is slow, however, since it depends on the lifetime of carriers. 
Although it seems counterintuitive that one photogenerated pair produces 
more than one collected current carrier, the underlying reason is simple: the 
collection of carriers does not eliminate photogenerated carriers. Only 
recombination removes them from circulation. Therefore, a photogenerated 
carrier can travel through the device many times before disappearing through 
recombination, depending on the ratio $\tau/t_r$. 
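
As a rough numerical illustration of the gain expression in equation (1.23), the following Python sketch evaluates $G = (1 + \mu_p/\mu_n)\,\tau/t_r$ with $t_r = L^2/(\mu_n V_{bias})$. The function name and all numerical values (mobilities, lifetime, contact spacing, bias) are assumptions chosen for illustration; they are not taken from the text.

```python
def photoconductor_gain(mu_n, mu_p, tau, length, v_bias):
    """Photoconductor gain G = (1 + mu_p/mu_n) * tau / t_r, as in equation (1.23).

    mu_n, mu_p : electron and hole mobilities [m^2/(V s)]
    tau        : carrier lifetime [s]
    length     : contact spacing L [m]
    v_bias     : applied bias voltage [V]
    """
    # Electron transit time: t_r = L / v_n, with v_n = mu_n * E and E = V_bias / L.
    t_r = length ** 2 / (mu_n * v_bias)
    return (1.0 + mu_p / mu_n) * tau / t_r


# Illustrative (assumed) values: silicon-like mobilities, 1 us lifetime,
# 10 um contact spacing, 1 V bias.
gain = photoconductor_gain(mu_n=0.135, mu_p=0.048, tau=1e-6,
                           length=10e-6, v_bias=1.0)
print(f"gain = {gain:.0f}")
```

With these assumed values the lifetime is about 1350 times the transit time, so the gain is well above 1000. Shortening the device or raising the bias increases the gain, but the response time remains set by $\tau$.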

To analyze the noise performance of the photoconductor, assume that the 
intensity of the incident optical power is modulated as 

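$$P(t) = P_{ph}\left(1 + M e^{j\omega_0 t}\right) \tag{1.24}$$

where $M$ is the modulation factor and $\omega_0$ is the modulation frequency. The 
continuity equation 

$$\frac{dn(t)}{dt} = -\frac{n(t)}{\tau} + \frac{\eta M P_{ph}\, e^{j\omega_0 t}}{\hbar\omega\, W L H} \tag{1.25}$$

is solved to find the modulated electron density $n(t)$ in the photoconductor: 

$$n(t) = \frac{\tau \eta M P_{ph}\, e^{j\omega_0 t}}{\hbar\omega\, W L H\,(1 + j\omega_0 \tau)} \tag{1.26}$$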



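The modulated hole concentration has the same expression. Therefore, the 
signal current is given by 

$$I(t) = q\left(n(t)\mu_n + p(t)\mu_p\right)\frac{V_{bias}}{L}\,WH = \frac{q\eta M P_{ph}\, e^{j\omega_0 t}}{\hbar\omega}\left(1 + \frac{\mu_p}{\mu_n}\right)\frac{\tau \mu_n V_{bias}}{L^2 (1 + j\omega_0 \tau)} = \frac{M I_{ph}\, e^{j\omega_0 t}}{1 + j\omega_0 \tau}\left(1 + \frac{\mu_p}{\mu_n}\right)\frac{\tau}{t_r} \tag{1.27}$$

The corresponding mean-square signal power is 

$$P_{sig} = \frac{1}{2\left(1 + \omega_0^2 \tau^2\right)}\left[M I_{ph}\,\frac{\tau}{t_r}\left(1 + \frac{\mu_p}{\mu_n}\right)\right]^2 \tag{1.28}$$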



Noise in photoconductive detectors arises mainly from fluctuations in the 
rates of generation and recombination of electron-hole pairs; this is known 
as generation-recombination noise. For an intrinsic photoconductor of 
bandwidth B , the mean-square generation-recombination noise power is 
given by the following equation [8]: 

$$\overline{i_{GR}^2} = \frac{4 q B I_0}{1 + \omega_0^2 \tau^2}\,\frac{\tau}{t_r} \tag{1.29}$$

where $I_0 = \Delta I$ is the constant current given by equation (1.23). The thermal 
noise generated by the shunt resistance $R_p$ (from Figure 1-6) is 

$$\overline{i_{R_p}^2} = \frac{4kTB}{R_p} \tag{1.30}$$



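Combining equations (1.28) through (1.30), the signal-to-noise power ratio 
(SNR) of a photoconductor is 

$$\mathrm{SNR} = \frac{M^2 I_{ph}}{8qB}\left(1 + \frac{\mu_p}{\mu_n}\right)\left[1 + \frac{kT}{q}\,\frac{t_r}{\tau}\,\frac{1}{I_0 R_p}\left(1 + \omega_0^2 \tau^2\right)\right]^{-1} \tag{1.31}$$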



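To achieve a specified SNR, the average optical power $P_{ph}$ incident on 
the detector must be at least 

$$P_{ph}^{min} = \frac{4\hbar\omega B\,(\mathrm{SNR})}{\eta M^2}\left(1 + \frac{\mu_p}{\mu_n}\right)^{-1}\left\{1 + \left[1 + \frac{M^2 kT}{2q^2 R_p B\,(\mathrm{SNR})}\left(\frac{t_r}{\tau}\right)^2\left(1 + \omega_0^2 \tau^2\right)\right]^{1/2}\right\} \tag{1.32}$$

For an average optical power $P_{ph}$ as in equation (1.24), the root-mean-square 
(RMS) signal power is $M P_{ph}/\sqrt{2}$, so the NEP may be found by setting 
SNR = 1: 

$$\mathrm{NEP} = \frac{2\sqrt{2}\,\hbar\omega B}{\eta M}\left(1 + \frac{\mu_p}{\mu_n}\right)^{-1}\left\{1 + \left[1 + \frac{M^2 kT}{2q^2 R_p B}\left(\frac{t_r}{\tau}\right)^2\left(1 + \omega_0^2 \tau^2\right)\right]^{1/2}\right\} \tag{1.33}$$

Figure 1-6. Noise equivalent circuit for photoconductor. $R_p$ is the shunt resistance of the 
semiconductor material, $\overline{i_{GR}^2}$ is the generation-recombination noise, and $\overline{i_{R_p}^2}$ is the thermal 
noise associated with the resistance $R_p$. 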



If the photoconductor is extrinsic, I 0 in equation (1.31) is the total 
current. The conductivity of the material causes a constant current to flow in 
addition to the photo-induced current, and the generation-recombination 
noise scales with the total current. The shunt resistance R p also decreases, so 
thermal noise increases as well. 

The primary advantage of the photoconductor is its high gain, and the 
primary drawbacks are its slow response and high noise. The following 
section discusses the characteristics of photodiodes, which provide lower 
noise and higher speed than photoconductors. 

1.3.2 Photodiode 



The PN junction photodiode is an important sensor for digital imaging 
because it is easy to fabricate in bulk silicon complementary metal oxide 
semiconductor (CMOS) technology, which is inexpensive and widely 
available. When light irradiates a junction diode, electron-hole pairs are 
generated everywhere. Electrons and holes generated inside the depletion 
region will be swept into the adjacent N and P regions, respectively, due to 
the electric field across the junction. In addition, electrons and holes 
generated in the adjacent P and N regions may diffuse into the depletion 
region and be swept into the other side. Photogenerated carriers swept across 
the depletion layer may be detected either as photocurrent or as 
photovoltage. For high quantum efficiency, the depletion layer must be thick 
in order to absorb as many photons as possible. However, thicker depletion 
layers increase carrier transit time, resulting in an intrinsic tradeoff between 
response speed and quantum efficiency. 

Figure 1-7. Equivalent circuit for a photodiode. $I_{ph}$ is the photocurrent, $D_0$ is a diode, $R_s$ is 
the series resistance, $R_j$ is the junction resistance, $C_j$ is the junction capacitance, $R_l$ is the load 
resistance, and $V$ is the reverse bias voltage. 

The photodiode can be operated in two basic modes: photoconductive 
mode and photovoltaic mode. The equivalent circuit for both modes of 
operation is shown in Figure 1-7, where $I_{ph}$ represents photocurrent, $D_0$ is a 
diode, $R_s$ is the series resistance, $R_j$ is the junction resistance, $C_j$ is the 
junction capacitance, $R_l$ is the load resistance, and $V$ is the reverse bias 
voltage. If the incident light power is $P_{ph}$, the photocurrent $I_{ph}$ corresponding 
to the current source in Figure 1-7 is 



$$I_{ph} = \frac{q\eta P_{ph}}{\hbar\omega} \tag{1.34}$$

where $\eta$ is the quantum efficiency of the photodiode and $\omega$ is the angular 
frequency of the incident light. In the short-circuit photoconductive mode, 
the voltage across the photodiode is zero and the external current is $I_{ph}$. From 
equations (1.34) and (1.16), the sensitivity of the photodiode is determined 
by quantum efficiency alone. Quantum efficiency is a function of the 
absorption coefficient, which is shown in Figure 1-4 as a function of the 
wavelength. The absorption coefficient for silicon is weak near the cutoff 
wavelength $\lambda_c$ because of the indirect band transition discussed previously. 
To increase the sensitivity of a vertical photodiode, the junction is usually 
designed to be very shallow and is reverse-biased in order to widen the 
depletion region. Most of the impinging light is absorbed in the depletion 
region when the width of the depletion layer is on the order of $1/\alpha$. For a 
reverse bias voltage $V$ (from Figure 1-7), the current $I_{ext}$ collected by the 
terminals is 



$$I_{ext} = I_{ph} - I_0\left[\exp\left(\frac{-V + I_{ext}(R_s + R_l)}{U_T}\right) - 1\right] + \frac{V - I_{ext}(R_s + R_l)}{R_j} \tag{1.35}$$

where $I_0$ is the reverse-bias leakage current of the diode $D_0$ and $U_T$ is the 
thermal voltage $kT/q$. The junction resistance $R_j$ is usually large ($\approx 10^8\ \Omega$), so 
the third term in equation (1.35) is negligible. In photovoltaic mode, carriers 
swept through the depletion layer build up a potential across the PN junction 
until the forward current of the diode balances the photocurrent. For an 
incident optical power $P_{ph}$, the resulting open-circuit voltage $V_{oc}$ of the 
photodiode is 



$$V_{oc} = U_T \ln\left(\frac{q\eta P_{ph}}{\hbar\omega I_0} + 1\right) \tag{1.36}$$
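
The logarithmic dependence of the open-circuit voltage on optical power in equation (1.36) can be checked numerically. The Python sketch below evaluates equations (1.34) and (1.36); the function names and the example values (green light, $\eta = 0.5$, $I_0 = 1$ pA, 300 K) are assumptions made for illustration only.

```python
import math

Q = 1.602176634e-19      # elementary charge [C]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
HBAR = 1.054571817e-34   # reduced Planck constant [J s]
C_LIGHT = 2.99792458e8   # speed of light [m/s]


def photocurrent(p_ph, omega, eta):
    """Primary photocurrent I_ph = q * eta * P_ph / (hbar * omega), equation (1.34)."""
    return Q * eta * p_ph / (HBAR * omega)


def open_circuit_voltage(p_ph, omega, eta, i_0, temperature=300.0):
    """Open-circuit voltage V_oc = U_T * ln(I_ph / I_0 + 1), equation (1.36)."""
    u_t = K_B * temperature / Q          # thermal voltage kT/q
    return u_t * math.log(photocurrent(p_ph, omega, eta) / i_0 + 1.0)


# Assumed example: 550 nm light, eta = 0.5, leakage current I_0 = 1 pA.
omega = 2.0 * math.pi * C_LIGHT / 550e-9
v1 = open_circuit_voltage(1e-6, omega, 0.5, 1e-12)
v2 = open_circuit_voltage(2e-6, omega, 0.5, 1e-12)
print(f"V_oc at 1 uW: {v1:.3f} V, at 2 uW: {v2:.3f} V")
```

When $I_{ph} \gg I_0$, doubling the optical power raises $V_{oc}$ by about $U_T \ln 2$ (roughly 18 mV at room temperature), which is why the photovoltaic mode compresses a wide range of light levels into a small voltage range.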



Because of the strong electric field inside the depletion region, the response 
speed of a photodiode is much faster than that of a photoconductor. Three 
factors determine the response speed of a photodiode: (a) the diffusion time 
of carriers outside the depletion layer, (b) the drift time inside the depletion 
layer, and (c) the time constant due to load resistance and parasitic diode 
capacitance. To reduce the diffusion time for carriers generated outside the 
depletion region, the junction should be formed very near the surface. To 
reduce the drift time for carriers in the depletion region, the junction should 
be strongly reverse biased so that carriers drift through at the saturation 
velocity. Strong reverse bias also minimizes parasitic capacitance of the 
junction diode. Both the width of the depletion layer and the carrier drift speed 
are proportional to $\sqrt{|V_{bias}|}$, which limits the drift-related improvement in 
response speed. 

Figure 1-8. Photodiode equivalent circuit for noise analysis: $R_s$ is the series resistance, $R_j$ is 
the junction resistance, $C_j$ is the junction capacitance, $R_l$ is the load resistance, and $R_i$ is the 
input resistance of the next stage. The signal photocurrent is $i_{ph}$; $\overline{i_{sh}^2}$ is the shot noise due to 
the photocurrent, the background current and the dark current; and $\overline{i_{th}^2}$ is the thermal noise 
associated with resistances $R_s$, $R_j$, $R_l$, and $R_i$. 



For the analysis of photodiode noise, assume that the signal is modulated 
as in equation (1.24). The average photocurrent remains the same as in 
equation (1.34). The root mean square (RMS) optical signal power is 
$M P_{ph}/\sqrt{2}$ and the corresponding RMS signal current is 

$$i_{ph} = \frac{q\eta M P_{ph}}{\sqrt{2}\,\hbar\omega} \tag{1.37}$$



Figure 1-8 shows the equivalent circuit of a photodiode for noise analysis. 
Shot noise arises from three sources: (a) the background current I B generated 
by ambient background light unrelated to the signal, (b) the dark current I D 
resulting from thermal generation of electron-hole pairs inside the depletion 
region and from reverse saturation current of the diode, and (c) the 
photocurrent I ph . Each of these currents is generated by an independent 
random process, and therefore all contribute to the shot noise 

$$\overline{i_{sh}^2} = 2q\left(I_{ph} + I_B + I_D\right)B \tag{1.38}$$

where $B$ is the bandwidth of the photodiode. Thermal noise arises from the 
parasitic resistances of the photodiode. The series resistance $R_s$ is assumed to 
be negligibly small in comparison to other resistances. Thermal noise is 
contributed by the junction resistance $R_j$, the load resistance $R_l$, and the input 
resistance $R_i$ (if the photodiode drives another circuit with finite input 
resistance). These parasitic resistances are modeled by an equivalent 
resistance $R_{eq}$, which contributes thermal noise: 

$$\overline{i_{th}^2} = \frac{4kTB}{R_{eq}} \tag{1.39}$$



Combining equations (1.37) to (1.39), the SNR for the photodiode is 

$$\mathrm{SNR} = \frac{\dfrac{1}{2}\left(\dfrac{q\eta M P_{ph}}{\hbar\omega}\right)^2}{2q\left(I_{ph} + I_B + I_D\right)B + 4kT\left(1/R_{eq}\right)B} \tag{1.40}$$



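To obtain a specified SNR, one must use an optical power $P_{ph}$ equal to or 
larger than 

$$P_{ph}^{min} = \frac{2\hbar\omega B\,(\mathrm{SNR})}{\eta M^2}\left\{1 + \left[1 + \frac{M^2 I_{eq}}{qB\,(\mathrm{SNR})}\right]^{1/2}\right\} \tag{1.41}$$

where the equivalent current $I_{eq}$ is 

$$I_{eq} = I_B + I_D + \frac{2kT}{qR_{eq}} \tag{1.42}$$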



For an average optical power $P_{ph}$ as in equation (1.24) and an RMS signal 
power of $M P_{ph}/\sqrt{2}$, the NEP of the photodiode is 

$$\mathrm{NEP} = \frac{\sqrt{2}\,\hbar\omega B}{\eta M}\left\{1 + \left[1 + \frac{M^2 I_{eq}}{qB}\right]^{1/2}\right\} \tag{1.43}$$
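
The relationship between equations (1.40) through (1.43) can be verified numerically: the minimum power from (1.41), substituted back into the SNR of (1.40), should return the target SNR. The Python sketch below does this; the function names and the example parameters (currents, resistance, bandwidth, optical frequency) are assumptions for illustration and are not taken from the text.

```python
import math

Q = 1.602176634e-19      # elementary charge [C]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
HBAR = 1.054571817e-34   # reduced Planck constant [J s]


def equivalent_current(i_b, i_d, r_eq, temperature=300.0):
    """I_eq = I_B + I_D + 2kT / (q R_eq), equation (1.42)."""
    return i_b + i_d + 2.0 * K_B * temperature / (Q * r_eq)


def photodiode_snr(p_ph, omega, eta, m, bandwidth, i_eq):
    """SNR of equation (1.40), with the thermal term folded into I_eq."""
    i_ph = Q * eta * p_ph / (HBAR * omega)
    signal = 0.5 * (m * i_ph) ** 2
    noise = 2.0 * Q * (i_ph + i_eq) * bandwidth
    return signal / noise


def photodiode_min_power(snr, omega, eta, m, bandwidth, i_eq):
    """Minimum average optical power for a target SNR, equation (1.41)."""
    root = math.sqrt(1.0 + m ** 2 * i_eq / (Q * bandwidth * snr))
    return 2.0 * HBAR * omega * bandwidth * snr / (eta * m ** 2) * (1.0 + root)


def photodiode_nep(omega, eta, m, bandwidth, i_eq):
    """NEP = M * P_min / sqrt(2) evaluated at SNR = 1, equation (1.43)."""
    return m * photodiode_min_power(1.0, omega, eta, m, bandwidth, i_eq) / math.sqrt(2.0)


# Assumed example values: omega for ~550 nm light, 1 MHz bandwidth.
omega = 3.425e15
i_eq = equivalent_current(i_b=1e-9, i_d=1e-10, r_eq=1e6)
p_min = photodiode_min_power(100.0, omega, eta=0.6, m=1.0, bandwidth=1e6, i_eq=i_eq)
snr_check = photodiode_snr(p_min, omega, eta=0.6, m=1.0, bandwidth=1e6, i_eq=i_eq)
nep = photodiode_nep(omega, eta=0.6, m=1.0, bandwidth=1e6, i_eq=i_eq)
print(f"P_min = {p_min:.3e} W, SNR check = {snr_check:.2f}, NEP = {nep:.3e} W")
```

In this assumed example the thermal-noise contribution $2kT/(qR_{eq})$ dominates $I_{eq}$, so a larger equivalent resistance would lower both the minimum power and the NEP.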



In the preceding discussion, it has been implicitly assumed that the 
quantum efficiency $\eta$ is a known parameter of the device. The remainder of 
this section describes how the quantum efficiency varies with parameters 
such as surface reflectivity, absorption coefficient, the width of depletion 
layer, and the carrier diffusion length. 

Figure 1-9. (a) Structure and (b) energy band diagram for a PIN diode. $E_c$ and $E_v$ correspond 
to conduction-band energy and valence-band energy respectively. 

This model applies to more general PN junction structures such as the 
PIN diode. The PIN diode is a variant on the simple PN junction in which 
the P-type and N-type regions are separated by a layer of intrinsic 
semiconductor. Performance measures such as quantum efficiency and 
response speed are optimized by choosing the thickness of the intrinsic layer. 
In a PIN diode, this intrinsic region is fully depleted under reverse bias, so 
the depletion width is determined primarily by geometry rather than by 
operating voltage. Figure 1-9 shows the structure of a PIN photodiode and 
its energy band diagram under reverse bias. Typically the depletion layer of 
a PN junction is thin, so photon absorption is assumed to be uniform over 
the depletion region. In contrast, the PIN diode has a much wider depletion 
region and absorption is a function of depth in the material. 

The total photocurrent consists of a drift current due to carriers generated 
inside the depletion layer and a diffusion current due to carriers generated 
outside the depletion region that diffuse into the reverse-biased junction. 
Therefore, the steady-state current density can be expressed as 

$$J_{tot} = J_{dr} + J_{diff} \tag{1.44}$$



If the incident signal at the surface of the PIN photodiode has optical power 
$P_{in}$ with angular frequency $\omega$, the optical flux within the material is 

$$\Phi_0 = \frac{P_{in}(1 - R)}{A\hbar\omega} \tag{1.45}$$

where $A$ is the area of the device and $R$ is the reflection coefficient. The 
generation rate of electron-hole pairs as a function of depth $x$ in the 
material is 

$$G(x) = \alpha \Phi_0 e^{-\alpha x} \tag{1.46}$$

The drift current density and diffusion current density of a P$^+$N photodiode 
are derived under the following assumptions: (a) the top P-type layer is 
much thinner than $1/\alpha$, so current due to carriers generated in the P$^+$ layer is 
negligible; (b) recombination within the depletion region is negligible; and 
(c) thermal generation current is negligible. Under these conditions, the drift 
current density is 

$$J_{dr} = -q\int_0^W G(x)\,dx = -q\Phi_0\left(1 - e^{-\alpha W}\right) \tag{1.47}$$

where $W$ is the width of the depletion layer and $x$ is measured from the surface 
into the material. The drift current density $J_{dr}$ is directed toward the surface 
of the material, hence the minus sign. The diffusion current due to carriers 
generated in the bulk semiconductor near the depletion region is determined 
by the quasi-static diffusion equation 



$$D_p \frac{\partial^2 p_n}{\partial x^2} - \frac{p_n - p_{n0}}{\tau_p} + G(x) = 0 \tag{1.48}$$

where $D_p$ is the diffusion coefficient of holes in the N-type semiconductor, 
$\tau_p$ is the lifetime of the holes, and $p_{n0}$ is the minority hole concentration at 
thermal equilibrium. Boundary conditions are given by an asymptotic return 
to equilibrium value $p_n(\infty) = p_{n0}$ and the equilibrium carrier concentration 
$p_n(W) = p_{n0}\exp(-V/U_T)$ for applied potential $V$ across the junction ($V$ is 
positive for reverse bias). Equation (1.48) may then be solved to find the 
distribution of holes in the N-type bulk semiconductor 

$$p_n(x) = p_{n0} - \left[p_{n0}\left(1 - e^{-V/U_T}\right) + C_1 e^{-\alpha W}\right]e^{(W - x)/L_p} + C_1 e^{-\alpha x} \tag{1.49}$$

where $L_p = \sqrt{D_p \tau_p}$ is the diffusion length of holes in the N-type 
semiconductor, and the coefficient $C_1$ is 

$$C_1 = \left(\frac{\Phi_0}{D_p}\right)\frac{\alpha L_p^2}{1 - \alpha^2 L_p^2} \tag{1.50}$$



Normally, a relatively large reverse-bias voltage is applied across the PN 
junction to increase quantum efficiency and response speed. In this case, the 
boundary condition of hole concentration is approximately $p_n(W) = 0$, and 
$p_n(x)$ simplifies to 

$$p_n(x) = p_{n0} - \left[p_{n0} + C_1 e^{-\alpha W}\right]e^{(W - x)/L_p} + C_1 e^{-\alpha x} \tag{1.51}$$



The diffusion current density $J_{diff}$ is given by 

$$J_{diff} = -qD_p \left.\frac{\partial p_n}{\partial x}\right|_{x=W} = -q\Phi_0\,\frac{\alpha L_p}{1 + \alpha L_p}\,e^{-\alpha W} - q p_{n0}\frac{D_p}{L_p} \tag{1.52}$$

where the minus sign indicates that current flows toward the surface, which 
is in the same direction as $J_{dr}$. The total current density $J_{tot}$ is the sum of drift 
and diffusion current densities: 

$$J_{tot} = J_{dr} + J_{diff} = -q\Phi_0\left[1 - \frac{e^{-\alpha W}}{1 + \alpha L_p}\right] - q p_{n0}\frac{D_p}{L_p} \tag{1.53}$$

Under normal operating conditions the second term is much smaller than the 
first, so the current density is proportional to the flux of incident light. 
Therefore, quantum efficiency is 

$$\eta = \frac{|J_{tot}|/q}{P_{in}/(A\hbar\omega)} = (1 - R)\left[1 - \frac{e^{-\alpha W}}{1 + \alpha L_p}\right] \tag{1.54}$$



To achieve high quantum efficiency, (a) the device must have a small 
reflection coefficient $R$ and a large absorption coefficient $\alpha$, and (b) both the 
diffusion length $L_p$ and the width of the depletion layer $W$ must be large in 
comparison with $1/\alpha$. Response speed degrades when the diffusion current is 
a large fraction of the total current and when $W$ is large. This imposes an 
inherent tradeoff between response speed and quantum efficiency for a 
photodiode. 
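
The dependence of quantum efficiency on depletion width and diffusion length can be explored with a short Python sketch. It assumes the closed-form result $\eta = (1 - R)\left[1 - e^{-\alpha W}/(1 + \alpha L_p)\right]$, which follows from the drift and diffusion currents when the dark-current term is neglected; the function name and the numerical values are illustrative assumptions, not data from the text.

```python
import math


def quantum_efficiency(alpha, width, diffusion_length, reflectivity):
    """eta = (1 - R) * (1 - exp(-alpha * W) / (1 + alpha * L_p)).

    alpha            : absorption coefficient [1/m]
    width            : depletion-layer width W [m]
    diffusion_length : minority-carrier diffusion length L_p [m]
    reflectivity     : surface reflection coefficient R (0..1)
    """
    collected = 1.0 - math.exp(-alpha * width) / (1.0 + alpha * diffusion_length)
    return (1.0 - reflectivity) * collected


# Assumed example: 1/alpha = 2 um, L_p = 10 um, R = 0.3.
alpha = 1.0 / 2e-6
for w_um in (0.5, 1.0, 2.0, 5.0):
    eta = quantum_efficiency(alpha, w_um * 1e-6, 10e-6, 0.3)
    print(f"W = {w_um:3.1f} um -> eta = {eta:.3f}")
```

As expected from the discussion above, the efficiency rises toward the reflection-limited ceiling $1 - R$ as either $W$ or $L_p$ becomes large compared with $1/\alpha$.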

1.3.3 Phototransistor 

In principle, all transistors are light sensitive and may be used as 
photodetectors. In practice, however, bipolar phototransistors usually exhibit 
better responsivity because the current flows through a larger volume than in 
the narrow channel of a field effect transistor. In addition, phototransistors 
can provide current gain during sensing. A bipolar phototransistor is shown 
schematically in Figure 1-10. In contrast with a conventional PNP transistor, 
a phototransistor has a large collector-base junction area for collection of 
photons. Phototransistors usually operate with the base terminal floating. 
Photogenerated holes in the reverse-biased collector-base junction will be 
swept into the collector and collected as photocurrent I ph . The emitter-base 
potential is increased by electrons generated in the base region and swept 
into the base from the collector. The increase of the emitter-base junction 
potential causes holes from the emitter to be injected into the base; most of 
these holes diffuse across the base and appear as additional collector current. 
Since the base is floating, the emitter current $I_E$ is equal to the collector 
current $I_C$: 

$$I_C = I_E = I_{CEO} = \left(1 + h_{FE}\right)\left(I_{ph} + I_{eq}\right) \tag{1.55}$$

where $h_{FE}$ is the static common-emitter current gain, $I_{eq}$ represents the 
background current and dark current, and $I_{CEO}$ is the collector-emitter current 
with open base. The gain of a phototransistor is $1 + h_{FE}$. As with the 
photodiode, the current across the active junction of a phototransistor 
contributes to shot noise. With the collector current given in (1.55) and at 
low temporal frequencies, the output noise power is 



$$\overline{i_n^2} = 2qI_C\left(1 + \frac{2h_{fe}^2}{h_{FE}}\right)B \tag{1.56}$$



where $h_{fe}$ is the small-signal common-emitter current gain (approximately 
equal to the static gain $h_{FE}$), and $B$ is the bandwidth of the phototransistor 
[9]. The net base current is zero, but internally the base current comprises 
two balanced flows: the junction and recombination currents balance the 
photocurrent, the dark current, and the background current. Therefore, both 
balanced components of the base current contribute shot noise, each with 
spectral density $2q(I_C/h_{FE})$. When referred to the output, this appears as 
$4qh_{fe}^2(I_C/h_{FE})$. In addition, the collector current contributes shot noise with 
spectral density $2qI_C$. Although the signal power is larger because of the 
phototransistor gain, the noise is larger as well. Given a modulated optical 
signal as in equation (1.24), the RMS photogenerated current signal is 

$$i_{ph} = \left(1 + h_{FE}\right)\frac{q\eta M P_{ph}}{\sqrt{2}\,\hbar\omega} \tag{1.57}$$

From equations (1.56) and (1.57), the SNR is given by 

$$\mathrm{SNR} = \frac{\left(1 + h_{FE}\right)\left(\dfrac{q\eta M P_{ph}}{\hbar\omega}\right)^2}{4qB\left(1 + \dfrac{2h_{fe}^2}{h_{FE}}\right)\left(\dfrac{q\eta P_{ph}}{\hbar\omega} + I_{eq}\right)} \tag{1.58}$$

Equation (1.58) may be solved to find the minimum value of the average 
optical power $P_{ph}$ required to achieve a specified SNR, which is 

$$P_{ph}^{min} = \frac{2\hbar\omega B\,(\mathrm{SNR})}{\eta M^2}\,\frac{1 + 2h_{fe}^2/h_{FE}}{1 + h_{FE}}\left\{1 + \left[1 + \frac{M^2\left(1 + h_{FE}\right)I_{eq}}{qB\,(\mathrm{SNR})\left(1 + 2h_{fe}^2/h_{FE}\right)}\right]^{1/2}\right\} \tag{1.59}$$

The NEP is given as $M P_{ph}^{min}/\sqrt{2}$ with SNR = 1. Under the assumption that 
$h_{fe} = h_{FE} \gg 1$, the NEP simplifies to 

$$\mathrm{NEP} = \frac{2\sqrt{2}\,\hbar\omega B}{\eta M}\left\{1 + \left[1 + \frac{M^2 I_{eq}}{2qB}\right]^{1/2}\right\} \tag{1.60}$$

Figure 1-10. Cross-sectional view of a PNP phototransistor. The collector-base junction area 
is large to increase the collection of photons. Photocurrent biases the floating base region, and 
the resulting emitter current is $1 + h_{FE}$ times the photocurrent. 



The tradeoff between low noise and high gain can be adjusted by varying the 
common-emitter current gain h FE . The response speed of the phototransistor 
is slow in comparison with the photodiode because of the large depletion 
capacitance from the large collector-base junction area. 
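
To make the gain-versus-noise tradeoff concrete, the following Python sketch compares the SNR of a phototransistor with that of a shot-noise-limited photodiode carrying the same primary photocurrent. It assumes the collector current $(1 + h_{FE})(I_{ph} + I_{eq})$ of equation (1.55) and an output noise power of $2qI_C(1 + 2h_{fe}^2/h_{FE})B$; the function names and numerical values are illustrative assumptions.

```python
Q = 1.602176634e-19  # elementary charge [C]


def photodiode_shot_snr(i_ph, m, bandwidth):
    """Shot-noise-limited photodiode SNR: (M^2 I_ph^2 / 2) / (2 q I_ph B)."""
    return m ** 2 * i_ph / (4.0 * Q * bandwidth)


def phototransistor_snr(i_ph, i_eq, m, bandwidth, h_fe_static, h_fe_small=None):
    """Phototransistor SNR from the amplified signal and the output shot noise.

    h_fe_static : static common-emitter current gain h_FE
    h_fe_small  : small-signal gain h_fe (defaults to h_FE)
    """
    if h_fe_small is None:
        h_fe_small = h_fe_static
    i_c = (1.0 + h_fe_static) * (i_ph + i_eq)              # collector current
    signal = 0.5 * ((1.0 + h_fe_static) * m * i_ph) ** 2   # mean-square signal
    noise = 2.0 * Q * i_c * (1.0 + 2.0 * h_fe_small ** 2 / h_fe_static) * bandwidth
    return signal / noise


# Assumed example: 1 nA primary photocurrent, no background or dark current.
diode = photodiode_shot_snr(1e-9, m=1.0, bandwidth=1e4)
for gain in (10.0, 100.0, 1000.0):
    ratio = phototransistor_snr(1e-9, 0.0, 1.0, 1e4, gain) / diode
    print(f"h_FE = {gain:6.0f}: SNR ratio (transistor / diode) = {ratio:.3f}")
```

Under these assumptions the ratio equals $(1 + h_{FE})/(1 + 2h_{FE})$, which approaches one half for large gain: the transistor amplifies the signal but does not improve the shot-noise-limited SNR.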

1.3.4 Photogate 

The photogate is closely related to the charge-coupled device (CCD), 
discussed in further detail in the next section. A photogate detector is a 
metal-oxide-semiconductor (MOS) capacitor with polysilicon as the top 
terminal. In contrast with the photodetectors discussed above, the photogate 
transduces optical signals into stored charges rather than voltage or current 
signals. These stored charges subsequently interact with other components to 
generate voltage or current signals. The photogate operates by integrating 
the incident photosignal, so the photogate output is a filtered and sampled 
version of the incident signal. A cross-sectional view of a photogate and its 
energy band diagram are shown in Figure 1-11. When a positive voltage is 
applied to the gate above the P-type substrate, holes are pushed away from 
the surface, leaving a depletion layer consisting of ionized acceptors near the 
surface. This space charge region causes the energy bands to bend near the 
semiconductor surface. The potential $\psi$ inside the semiconductor is defined 
as the difference between the intrinsic Fermi-level potential at any location 
and its value in the bulk semiconductor, and is positive if the energy bands 
bend downward at the surface. The surface potential $\psi_s$ is defined as the 
value of the potential $\psi$ at the surface of the semiconductor. 

Figure 1-11. Cross-sectional view of the photogate and its energy band structure. The 
potential $\psi$ is defined as the difference between the intrinsic Fermi-level potential at any 
location and its value in the bulk semiconductor. The surface potential $\psi_s$ is defined as the 
value of the potential $\psi$ at the surface of the semiconductor. 

Neglecting potential shifts due to charges at the interface and in the oxide 
layer and the difference in the work function (the difference in Fermi energy 
between the polysilicon gate and the semiconductor substrate), the surface 
potential $\psi_s$ is zero when the applied gate voltage $V_G$ is zero. Applied gate 
voltage ($V_G \neq 0$) appears partially across the gate oxide as $V_{ox}$ and partially 
across the depletion region beneath as surface potential $\psi_s$: 

$$V_G = V_{ox} + \psi_s \tag{1.61}$$

Assuming that the substrate is uniformly doped with acceptor density N A , the 
charge density on the gate capacitor must balance the ionized charges in the 
semiconductor; therefore, the potential across the parallel plate capacitor $V_{ox}$ 
can be expressed as 

$$V_{ox} = \frac{qN_A W_D}{C_{ox}} \tag{1.62}$$

where $C_{ox}$ is the oxide capacitance per unit area and $W_D$ is the width of the 
depletion layer. Poisson’s equation 

$$\nabla^2 \psi(x) = \frac{qN_A}{\varepsilon_{si}} \tag{1.63}$$

may be solved to find the surface potential $\psi_s$: 

$$\psi_s = \psi(0) - \psi(W_D) = \frac{qN_A W_D^2}{2\varepsilon_{si}} \tag{1.64}$$

Canceling out $W_D$ in (1.62) and (1.64), the gate voltage $V_G$ may be rewritten 
in terms of $\psi_s$ as 

$$V_G = \frac{\sqrt{2qN_A \varepsilon_{si} \psi_s}}{C_{ox}} + \psi_s \tag{1.65}$$



If the photogate collects a signal charge density $Q_{sig}$ at the surface of the 
semiconductor, the gate voltage $V_G$ becomes 

$$V_G = \frac{\sqrt{2qN_A \varepsilon_{si} \psi_s} + Q_{sig}}{C_{ox}} + \psi_s \tag{1.66}$$

Rearranging equation (1.66) to find the surface potential $\psi_s$ in terms of the 
gate voltage and signal charge gives 

$$\psi_s = V_G + V_0 - \frac{Q_{sig}}{C_{ox}} - \sqrt{2\left(V_G - \frac{Q_{sig}}{C_{ox}}\right)V_0 + V_0^2} \tag{1.67}$$

where $V_0 = qN_A \varepsilon_{si}/C_{ox}^2$. Since $V_0$ is usually very small compared to the gate 
bias voltage $V_G$, the surface potential $\psi_s$ is, to a good approximation, a linear 
function of the signal charge $Q_{sig}$. 
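
The near-linear dependence of surface potential on signal charge can be checked with a short Python sketch. It assumes the closed-form solution $\psi_s = V_G + V_0 - Q_{sig}/C_{ox} - \sqrt{2(V_G - Q_{sig}/C_{ox})V_0 + V_0^2}$ with $V_0 = qN_A\varepsilon_{si}/C_{ox}^2$, and verifies it by substituting the result back into the gate-voltage relation. The function names, doping level, and oxide thickness are assumptions chosen for illustration.

```python
import math

Q = 1.602176634e-19                 # elementary charge [C]
EPS_0 = 8.8541878128e-12            # vacuum permittivity [F/m]
EPS_SI = 11.7 * EPS_0               # permittivity of silicon [F/m]
EPS_OX = 3.9 * EPS_0                # permittivity of SiO2 [F/m]


def surface_potential(v_g, q_sig, n_a, c_ox):
    """Surface potential for gate voltage v_g [V] and signal charge q_sig [C/m^2]."""
    v_0 = Q * n_a * EPS_SI / c_ox ** 2
    u = v_g - q_sig / c_ox
    return u + v_0 - math.sqrt(2.0 * u * v_0 + v_0 ** 2)


def gate_voltage(psi_s, q_sig, n_a, c_ox):
    """Gate voltage implied by a given surface potential and signal charge."""
    return (math.sqrt(2.0 * Q * n_a * EPS_SI * psi_s) + q_sig) / c_ox + psi_s


# Assumed example: N_A = 1e15 cm^-3 (1e21 m^-3), 50 nm gate oxide, V_G = 5 V.
n_a = 1e21
c_ox = EPS_OX / 50e-9
for q_sig in (0.0, 1e-4, 2e-4, 3e-4):
    psi = surface_potential(5.0, q_sig, n_a, c_ox)
    print(f"Q_sig = {q_sig:.1e} C/m^2 -> psi_s = {psi:.3f} V")
```

With these assumed values $V_0$ is a few tens of millivolts, so the surface potential falls almost linearly as charge accumulates, which is what allows the collected charge to be read out as a proportional voltage change.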

The derivation of the quantum efficiency of a photogate is similar to that 
of a photodiode. The depletion region induced by the electric field due to the 
applied gate voltage is considered to be a shallow junction with junction 
depth $x_j = 0$; this is consistent with the assumption that the photodiode has 
no photon absorption in the diffusion region above the junction. The 
boundary condition for minority-carrier concentration at the edge of the 
depletion region is 

$$n_p(W_D) = n_{p0}\,e^{-\psi_s/U_T} \tag{1.68}$$

where $n_{p0}$ is the equilibrium concentration of electrons in the P-type 
substrate. Following the derivation of quantum efficiency for the photodiode 
and assuming that (a) the substrate thickness is much greater than both the 
diffusion length $L_n$ and the light penetration length $1/\alpha$ and (b) the depletion 
width is constant, the quantum efficiency of the photogate is 

$$\eta = (1 - R)\left(1 - \frac{e^{-\alpha W_D}}{1 + \alpha L_n}\right) \tag{1.69}$$



where L n is the diffusion length of electrons in the P-type substrate. (A more 
comprehensive derivation is available in van de Wiele [10].) The depletion 
width depends on the surface potential, which changes as signal charges are 
collected at the surface, so in the general case quantum efficiency varies 
over the charge integration interval. 

The dominant sources of noise in a photogate include dark current, shot 
noise, and transfer noise. A photogate sensor usually works in a deep- 
depletion region while the gate voltage is held at a high voltage in order to 
attract photogenerated electrons under the gate. During integration, 
thermally generated minority carriers slowly fill up the potential well at the 
surface. This thermally generated dark current limits the longest integration 
time. Thermal generation is a Poisson random process, so the dark current 
generates a shot noise with variance proportional to its mean value. The 
arrival of photons is also a Poisson random process, so the photocurrent 
contributes shot noise as well. As mentioned above, the incident light is 
transduced into surface charge in a photogate. If the optical power projecting 
on a photogate is $P_{ph}$, the signal charge $Q_{sig}$ collected during the integration 
time $t_{int}$ is 

$$\left|Q_{sig}\right| = I_{ph}\,t_{int} = \frac{q\eta P_{ph}}{\hbar\omega}\,t_{int} \tag{1.70}$$



where $\eta$ is the quantum efficiency, $\hbar\omega$ is the photon energy, and $I_{ph}$ is the 
photocurrent. The shot noise generated by photocurrent $I_{ph}$ and dark current 
$I_D$ has a uniform power spectral density (PSD) of $2q(I_{ph} + I_D)$ A$^2$/Hz. The 
root-mean-square current noise sampled at time $t_{int}$ is 

$$\sqrt{\overline{i_n^2}} = \sqrt{2q\left(I_{ph} + I_D\right)\frac{1}{2t_{int}}} = \sqrt{\frac{q\left(I_{ph} + I_D\right)}{t_{int}}} \tag{1.71}$$

where $1/(2t_{int})$ is the bandwidth of the photogate with sampling rate $1/t_{int}$. 
The charge fluctuation due to this noise current at the end of the integration 
period is given by 

$$Q_n = \sqrt{\frac{q\left(I_{ph} + I_D\right)}{t_{int}}}\;t_{int} = \sqrt{q\left(I_{ph} + I_D\right)t_{int}} \tag{1.72}$$

The SNR is obtained by combining equations (1.70) and (1.72): 

$$\mathrm{SNR} = \left(\frac{Q_{sig}}{Q_n}\right)^2 = \frac{\eta P_{ph}\,t_{int}}{\hbar\omega}\left(\frac{I_D \hbar\omega}{q\eta P_{ph}} + 1\right)^{-1} \tag{1.73}$$

The minimum optical power $P_{ph}$ required to achieve a specified SNR is 

$$P_{ph}^{min} = \frac{\hbar\omega\,(\mathrm{SNR})}{2\eta\,t_{int}}\left\{1 + \left[1 + \frac{4I_D t_{int}}{q\,(\mathrm{SNR})}\right]^{1/2}\right\} \tag{1.74}$$

The equation above is rearranged with SNR = 1 to give 

$$\mathrm{NEP} = \frac{\hbar\omega}{2\eta\,t_{int}}\left\{1 + \left[1 + \frac{4I_D t_{int}}{q}\right]^{1/2}\right\} \tag{1.75}$$

The stored charges must transfer at least once to other components to 
produce an output current or voltage signal, and therefore the conductance of 
the transfer channel contributes thermal noise. The transfer noise at the 
output node is composed of this thermal noise integrated over the bandwidth, 
resulting in kT/C noise that is independent of conductance. Consequently, 
the root-mean-square value of charge fluctuation on the output capacitance 
$C_{eq}$ is 

$$\sqrt{kTC_{eq}} \tag{1.76}$$

The noise increases by this value as it is transferred to the output node. 
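
The photogate noise budget described above can be sketched numerically in Python. The code assumes a shot-noise charge of $\sqrt{q(I_{ph} + I_D)t_{int}}$, the corresponding minimum power for a target SNR, and a transfer noise of $\sqrt{kTC_{eq}}$; the function names and the example values (dark current, integration time, output capacitance) are illustrative assumptions.

```python
import math

Q = 1.602176634e-19      # elementary charge [C]
K_B = 1.380649e-23       # Boltzmann constant [J/K]
HBAR = 1.054571817e-34   # reduced Planck constant [J s]


def photogate_snr(p_ph, omega, eta, i_dark, t_int):
    """Shot-noise-limited SNR = (Q_sig / Q_n)^2 after integrating for t_int."""
    i_ph = Q * eta * p_ph / (HBAR * omega)
    q_sig = i_ph * t_int
    q_noise = math.sqrt(Q * (i_ph + i_dark) * t_int)
    return (q_sig / q_noise) ** 2


def photogate_min_power(snr, omega, eta, i_dark, t_int):
    """Minimum optical power that reaches the target SNR."""
    root = math.sqrt(1.0 + 4.0 * i_dark * t_int / (Q * snr))
    return HBAR * omega * snr / (2.0 * eta * t_int) * (1.0 + root)


def ktc_noise_electrons(c_eq, temperature=300.0):
    """RMS transfer (kT/C) noise on capacitance c_eq, expressed in electrons."""
    return math.sqrt(K_B * temperature * c_eq) / Q


# Assumed example: 10 fA dark current, 10 ms integration, 10 fF output node.
omega = 3.425e15
p_min = photogate_min_power(1000.0, omega, eta=0.5, i_dark=1e-14, t_int=1e-2)
print(f"P_min = {p_min:.3e} W")
print(f"SNR check = {photogate_snr(p_min, omega, 0.5, 1e-14, 1e-2):.1f}")
print(f"kTC noise = {ktc_noise_electrons(1e-14):.1f} electrons")
```

Longer integration times lower the required optical power, but only until the dark current fills a significant fraction of the potential well, which is the limit on integration time noted above.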

1.4 Semiconductor image sensors 

In previous sections, several devices were discussed as single 
photodetector elements. Imaging is the process of using arrays of such 
detectors to create and store images. Whereas the predominant technology 
for optical imaging applications remains the charge-coupled device (CCD), 
the active pixel sensor (APS) is quickly gaining popularity. CCD technology 
has revolutionized the field of digital imaging by enabling diverse 
applications in consumer electronics, scientific imaging, and computer 
vision through its high sensitivity, high resolution, large dynamic range, and 
large array size [11-12]. APS technology is now beginning to enable new 
applications in digital imaging by offering improved performance relative to 
CCD technology in the areas of low-power operation, high speed, and ease 
of integration. The operation and performance of a CCD imager are 
reviewed, and then the APS imager is introduced. The two technologies are 
compared to show why APS technology is an attractive alternative to CCD 
technology. Finally, the technical challenges involved in developing a good 
APS imager are discussed. More detailed information about APS design is 
available in chapter 4. 

1.4.1 Basic CCD structure and operation 

The basic element of CCD technology is similar to the photogate 
introduced in the previous section. A CCD imager consists of an array of 
closely spaced MOS capacitors operated in deep depletion on a continuous 
insulator layer over the semiconductor substrate. A schematic cross-sectional 
view of a three-phase n-channel CCD using three overlapping polysilicon 
gates is shown in Figure 1-12. Three adjacent polysilicon gates define an 
imager pixel, with the pixel boundary determined by the voltages applied to 
the different gates. The gates have the same length as the imager array and 
are responsible for storing and transferring the signal charges accumulated 
during the integration phase. Lateral confinement structures isolate signal 
charges into parallel channels along the direction of charge transfer, and are 
typically implemented using a combination of thick field oxide and light 
doping (see Figure 1-13). 

Figure 1-12. Cross-sectional view of a three-phase CCD pixel using three overlapping 
polysilicon gates (A, B, and C). The pixel boundary is determined by the voltages applied 
to the gate. The dashed line indicates the depletion boundary during integration, with a high 
voltage applied to gate A. 

During integration, a higher gate voltage is applied to gate A than to 
gates B and C. This forces the material under gate A into deep depletion, so 
gate A serves as the signal charge collection and storage element. During the 
first phase of the three-phase clock, the voltage applied to gate B is pulsed to 
a high level while the voltage applied to gate A decreases slowly. Thus the 
charges stored under gate A are transferred into the potential well under gate 
B. In the second phase, the charges under gate B are again transferred to gate 
C by pulsing the gate voltage for gate C to a high level while decreasing the 
voltage on gate B and maintaining a low voltage on gate A. In the third 
phase, the voltage on gate A returns high while the voltage on gate C 
decreases. After one cycle of the three-phase clock, the signal charge packet 
has been transferred to the next pixel. Repeating this process results in a 
linear motion of the charge packet from the original pixel to the end of the 
row where it is measured by generating either a voltage or current signal. An 
optical image represented by the stored charge packets is obtained by 
scanning through the CCD array. 
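
The clocking sequence described above can be sketched in a few lines of Python. This is a minimal illustration rather than a device model: the function names `shift_one_gate` and `read_out_row` are hypothetical, each pixel is assumed to consist of gates A, B, and C with the collected charge initially under gate A, and every transfer is assumed to be lossless.

```python
def shift_one_gate(gates):
    """One clock phase: every packet moves to the adjacent gate toward the output."""
    output = gates[-1]                 # charge leaving the last gate reaches the output node
    return [0.0] + gates[:-1], output


def read_out_row(packets):
    """Clock a row of three-gate pixels out; returns packets in order of arrival."""
    # Gate A of each pixel holds the integrated charge; gates B and C start empty.
    gates = [g for q in packets for g in (q, 0.0, 0.0)]
    arrived = []
    for _cycle in range(len(packets)):
        collected = 0.0
        for _phase in range(3):        # three phases move a packet by one full pixel
            gates, out = shift_one_gate(gates)
            collected += out
        arrived.append(collected)
    return arrived


row = [5.0, 7.0, 9.0]                  # charge packets, arbitrary units
readout = read_out_row(row)
```

In this sketch the pixel nearest the output is read first, so `readout` lists the packets in reverse spatial order, one packet per complete clock cycle.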

Figure 1-13. Using a combination of techniques including thick field oxide and doping, 
CCD signal charges are confined laterally in parallel channels along the direction of charge 
transfer. 

If every pixel in the CCD imager collects photogenerated charges, the 
time required to transfer charge packets through the array significantly limits 
the speed of processing each image frame. One strategy to increase the speed 
of image readout is to dedicate pixels to storing or shifting out charge 
packets collected by other pixels. Figure 1-14(a) shows a schematic CCD 
imager using this idea to increase frame rate. However, interleaving storage 
pixels with charge-collecting pixels degrades the resolution of the imager. A 
frame transfer imager addresses this problem while maintaining fast transfer 
speed. As shown in Figure 1-14(b), a frame transfer imager uses two CCD 
arrays of equal size: one records the image and the other stores the image 
while a single register at the periphery reads out the stored values. The 
charge packets collected by the imaging area are transferred and temporarily 
stored in the storage area. During the next cycle of charge accumulation in 
the imaging array, the signal charges in the storage array are transferred one 
line at a time to the readout register. The frame transfer imager obtains both 
high frame rate and fine spatial resolution at the expense of larger chip size. 

1.4.2 CCD operating parameters and buried channel CCD 

Recording an image using a CCD imager comprises several processes: 
generation of charges by incident photons; collection of charges by the 
nearest potential well; transfer of charge packets through the CCD array; and 
readout by the output preamplifier. The performance of a CCD imager is 
quantified according to quantum efficiency $\eta$, charge collection efficiency 
$\eta_{CC}$, charge transfer efficiency $\eta_{CT}$, noise, and response speed. 

Quantum efficiency is the number of charges generated per incident 
photon. CCD imagers can achieve quantum efficiencies as high as 90% 
using techniques such as thinned high-resistivity substrates, back 
illumination, and anti-reflection coatings. 

Figure 1-14. Two CCD imagers that increase frame rate. (a) An imager with shielded pixels 
(represented by the gray columns) for signal packet transfer interleaved with photosensitive 
pixels. The bottom row represents a shift register that reads out the signal column-wise. 
(b) An imager with a storage array having the same size as the imaging array, known as a frame 
transfer imager. Panel labels: light-sensitive area; shielded image storage area. 

Diffusion of photogenerated charges within the substrate restricts the 
spatial resolution of CCDs. Charges generated at one pixel may diffuse to 
neighboring pixels, depending on the depth at which the charges are 
generated. Charges generated near the front surface, where the electric field 
is high, are readily collected by the corresponding potential well. Charges 
generated deeper in the substrate experience a weak electric field, so those 
electrons may diffuse into surrounding pixels. This phenomenon is known as 
a “split event”. The diffusing charges may also be lost to trapping and 
recombination, a phenomenon known as a “partial event”. Both types of 
events cause image blurring and degraded resolution. This is especially 
important for quantitative applications such as spectroscopy, in which the 
number of charges collected by a pixel represents the energy of the 
impinging photon. Charge collection efficiency reflects the degree to which 
the charges generated by a single photon are collected by a single pixel [13]. 

CCD imagers rely on the transfer of charge packets from the location 
where they were initially generated to readout circuitry outside the array. 
The charge transfer efficiency $\eta_{CT}$ is defined as the ratio of charge 
transferred (to the next electrode) to the initial charge stored (under the first 
electrode). If a packet of total charge $Q_0$ is transferred $n$ times down to the 
register, the charge $Q_n$ that reaches the register is 

$$Q_n = Q_0\,\eta_{CT}^{\,n} \tag{1.77}$$

The three basic charge-transfer mechanisms are thermal diffusion, 
self-induced drift, and fringing field drift. For small charge packets, thermal 
diffusion is the dominant transfer mechanism. The total charge under the 
storage gate decreases exponentially with the diffusion time constant 
$\tau_{th}$ [14], 

$$\tau_{th} = \frac{4L^2}{\pi^2 D_n} \tag{1.78}$$

where $L$ is the center-to-center spacing between two adjacent electrodes and $D_n$ 
is the diffusion coefficient of the carrier. For large charge packets, 
self-induced drift due to electrostatic repulsion among the charges within the 
packet is the dominant transfer mechanism. The fringing electric field is 
independent of the charge intensity, so fringing field drift dominates the 
charge transfer once most of the charges have shifted. 

Three factors determine the charge transfer efficiency: dark current, finite 
charge transport speed, and interface traps. Dark current arises from thermal 
charge generation in the depletion region, minority carrier diffusion in the 
quasi-neutral diffusion region outside the depletion region, and surface 
recombination current. When a high voltage pulse is applied to an electrode, 
the surface region immediately transitions into deep depletion. While the 
gate voltage remains high, thermally generated minority carriers gradually 
fill the surface potential well, and this artifact corrupts the signal charge 
packet. Thus, the frequency of the clock must be sufficiently high to 
minimize the influence of dark current on charge transfer. At very high 
frequencies, however, the finite charge transport speed degrades the charge 
transfer efficiency. Short electrode length and high gate voltage help to 
reduce the charge transfer time. In addition, electrons are used as signal 
charges rather than holes because of their higher mobility. At intermediate 
frequencies, interface trapping of the signal charge determines the transfer 
efficiency. Figure 1-15 shows trapping and release of carriers from interface 
traps. As the charge packet enters a potential well, empty traps are filled 
with signal charges immediately. Some of the trapped charges are released 
quickly and continue on with the correct charge packet as it transfers to the 
next electrode. Other interface traps have much slower time constants of 
release, so the trapped carriers are released into subsequent packets. This 
delayed release of carriers produces a charge loss from the first packet to the 
tail of a sequence. Together these mechanisms result in charge transfer 
inefficiency, which causes signal distortion and phase delay. Interaction of 
the signal charge packets with interface traps can be decreased significantly 
by maintaining a constant background charge so that interface traps remain 
filled. This background charge is called a “fat zero”. The drawback of using 
a fat zero is reduction in dynamic range. Another method for avoiding 
charge transfer inefficiency due to interface traps is buried channel CCD 
(BCCD) technology. The transfer inefficiency ($\varepsilon_{CT} = 1 - \eta_{CT}$) of a BCCD is 
an order of magnitude smaller than that of a surface channel CCD (SCCD) 
with the same geometry. 

Figure 1-15. Illustration of transfer noise introduced by interface traps. 

The noise floor determines the smallest charge packet that can be 
detected. It is an important factor in determining the smallest pixel size. 
Several noise sources limit the performance of a CCD imager. At high signal 
levels, fixed-pattern noise (FPN) dominates; this noise source arises from 
pixel-to-pixel variation within the array and can be reduced by adjusting the 
voltages of the clocks. At intermediate signal levels, shot noise due to the 
signal current limits the performance. At the lowest signal levels, dark 
current, fat zero, and amplifier noise limit the performance of the CCD 
imager. 

For further details about CCD devices and imagers, excellent references 
are available [14-17]. 

Figure 1-16. Two APS photocircuits using (a) a photodiode and (b) a photogate as the 
photodetector. The components inside the dashed-line boxes constitute the pixels. 



1.4.3 Active pixel sensors 

CMOS-based active pixel sensors (APS) have been studied since the 
1980s as an alternative to CCD technology. An APS is an imager in which 
every pixel includes at least one active transistor. Transistors in the APS 
pixel may operate as both amplifier and buffer in order to isolate the 
photogenerated charge from the large capacitance of the common output 
line. While any photodetector may be used, APS pixels often use 
photodiodes or photogates. Figure 1-16 illustrates two common 
photocircuits (circuits inside the pixel) that use photodiode and photogate 
detectors, respectively. APS pixels normally operate in a charge integration 
mode, but may operate in a continuous mode as well. The operation of 
different pixels will be discussed further in the following section. The 
differences between CCD and APS systems will be highlighted here. 

Both CCD and APS systems are based on silicon technology, so they 
have similar sensitivities for visible and infrared wavelengths. CCD imagers 
require a nearly perfect charge transfer from one electrode to the next. The 
necessity for nearly perfect charge transfer efficiency becomes clear when 
the fraction of a charge packet that reaches the output of the CCD imager is 
examined. A typical three-phase CCD pixel has three electrodes, so a signal 
packet may be shifted several thousand times on average for an imager with 
2048 × 2048 pixels. For a charge transfer efficiency of 99.99%, 26.5% of 
charges in the original packet will be lost in the array before measurement. 
Consequently, CCD imagers are (a) radiation soft (i.e., sensitive to 
displacement damage caused by radiation); (b) difficult to fabricate in large 
arrays; (c) difficult to integrate with other on-chip electronics; and (d) 
difficult to operate at very high frequency [18]. Specialized fabrication 
processes offer improved performance for CCD imagers, but at three times 
the expense of standard CMOS technology. On the other hand, APS imagers 
eliminate macroscopic transfer of the charge packet. Thus, charge transfer 
efficiency is of limited importance, and APS technology avoids the 
disadvantages associated with maximizing charge transfer efficiency. 
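
The 26.5% figure quoted above can be checked with a short calculation based on equation (1.77). The sketch below rests on an assumption that is not stated in the text: the average packet is taken to cross half of a 2048-pixel column, at three electrode transfers per pixel, and transfers in the horizontal readout register are ignored. The function name `surviving_fraction` is illustrative only.

```python
def surviving_fraction(eta_ct, n_transfers):
    """Fraction of a charge packet remaining after n transfers, per equation (1.77)."""
    return eta_ct ** n_transfers


pixels_per_column = 2048
electrodes_per_pixel = 3
# Assumed average path: half the column, three electrode transfers per pixel.
avg_transfers = (pixels_per_column // 2) * electrodes_per_pixel

lost = 1.0 - surviving_fraction(0.9999, avg_transfers)
print(f"average transfers: {avg_transfers}, charge lost: {lost:.1%}")
```

Under this assumption the average packet undergoes 3072 transfers, which is consistent with the "several thousand" shifts mentioned in the text.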

Though technically feasible, it is usually impractical to integrate 
auxiliary functions such as clock drivers, logic control, and signal processing 
together with the CCD imager. CCD imagers are therefore multi-chip 
systems, with the resulting large size, heavy weight, and high cost. The 
primary motivation for developing CMOS-based imagers such as APS is the 
ease of integration. Virtually all camera electronics such as timing and 
control signal generators, analog-to-digital converters (ADC), and analog 
reference generating circuits can be implemented on the same substrate as 
the imager array. This leads to APS imagers that are “camera-on-a-chip” 
systems, with digital interfaces that facilitate integration with external 
systems. As a result, APS imager systems offer compact size, light weight, 
low cost, and low power consumption. A CMOS APS requires only one- 
hundredth the power of a CCD system. In addition, APS technology will 
benefit from the continuous process improvement of mainstream CMOS 
technology. Other advantages of APS systems include compatibility with 
transistor-to-transistor logic (TTL) (0-5 V), readout windowing, random 
access to pixels, and variable integration time. 

Despite these advantages and despite twenty years of investigation in 
APS technology, CCD still remains the dominant imaging technology. Since 
its inception in 1970, CCD has improved and matured significantly. Using 
specialized fabrication processes, the integrity of the individual charge 
packets is maintained as they are physically transferred across the chip to an 
output amplifier. CCDs provide images with excellent quality for 
wavelengths ranging from X-rays to infrared radiation. To compete with 
CCDs, APS technology must offer the same image quality while still 
retaining its apparent advantages. 

APS imagers suffer from fixed pattern noise (FPN) introduced by the 
mismatch of active transistors in different pixels. This noise can be 
ameliorated by using the technique of correlated double sampling (CDS). 
CDS removes systematic offsets by comparing samples taken from the same 
pixel before and after integration. Typically, there is a CDS circuit for each 
column, which reduces the FPN for pixels in the same column to an 
insignificant level. However, column-to-column FPN must be suppressed 
using other strategies. APS systems also suffer from poor resolution due to a 
low fill factor. (The fill factor is the ratio of photosensing area in each pixel 
to the total pixel area.) Typically each pixel uses three active transistors, 
which limits the fill factor when using standard CMOS processes. Scaling 
down the feature size improves the fill factor, but at the cost of higher noise 
and reduced dynamic range due to lower power supplies. 
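
The effect of correlated double sampling on pixel offset FPN can be illustrated with a toy model. The sketch below is a simplification under stated assumptions: each pixel is given a fixed, hypothetical offset with a 20 mV spread, the scene is uniformly illuminated, temporal noise is ignored, and the function name `read_array` is illustrative. It is not a description of any particular CDS circuit.

```python
import random
import statistics


def read_array(signals, offsets, use_cds):
    """Return one output per pixel, with or without correlated double sampling."""
    outputs = []
    for signal, offset in zip(signals, offsets):
        reset_level = offset              # sample taken before integration
        final_level = offset + signal     # sample taken after integration
        if use_cds:
            outputs.append(final_level - reset_level)   # offset cancels
        else:
            outputs.append(final_level)                 # offset remains in the output
    return outputs


rng = random.Random(0)                                   # fixed seed for repeatability
offsets = [rng.gauss(0.0, 0.02) for _ in range(1000)]    # assumed per-pixel offsets (V)
signals = [0.5] * 1000                                   # uniform illumination (V)

fpn_raw = statistics.pstdev(read_array(signals, offsets, use_cds=False))
fpn_cds = statistics.pstdev(read_array(signals, offsets, use_cds=True))
print(f"FPN without CDS: {fpn_raw:.4f} V, with CDS: {fpn_cds:.2e} V")
```

In this idealized model the pixel-level offset is removed completely; offsets introduced after the point of subtraction, such as column-to-column mismatch, would not be cancelled by the same step.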

1.5 Information rate 

The preceding sections discussed the semiconductor physics underlying 
phototransduction, several photodetectors, and typical pixel structures for 
silicon-based imagers. Whereas CCDs are inherently charge-mode devices, 
APS pixels may use many different methods to convert photogenerated 
carriers into an electronic output signal. This section will consider the 
functional performance of candidate APS pixels, as evidenced by their 
abilities to represent information about incident light intensity. 

1.5.1 Information 

Imagers are arrays of sensors that transduce optical signals into electronic 
signals. This transduction is essentially the same as communicating the 
optical signals while transforming them from one physical variable (light 
intensity) into another (charge, current, or voltage). Information is a 
fundamental measure of communication that allows fair comparisons of 
performance among different technologies and signal representations. The 
pixel may be considered a communication channel, with the input signal 
provided by the incident optical signal and the output signal produced by 
transducing the input signal into an electronic signal. The information rate 
will be calculated for several different pixels that transduce the optical input 
signal into an electronic output in the form of charge, current, or voltage. 

Information scales logarithmically with the number of possible messages 
if those messages are equally likely, and reflects the uncertainty in the 
outcome of the communication — the higher the number of possible 
outcomes, the more that can be learned from the communication and thus the 
more information that can be conveyed. The information transmitted through 
an input/output mapping is known as the mutual information $I(X;Y)$. This is 
the information that can be learned about the input by observing the output, 
or vice versa. Mutual information is a measure of dependence between the 
two signals $X$ and $Y$. Given the joint probability distribution $p(x,y)$ and the 
marginal distributions $p(x)$ and $p(y)$, the mutual information of signals 
$X$ and $Y$ is computed as the average uncertainty of the joint distribution 
relative to the product distribution $p(x)\,p(y)$: 

$$I(X;Y) = \sum_{x,y} p(x,y)\,\log_2 \frac{p(x,y)}{p(x)\,p(y)} \tag{1.79}$$



The rate $R$ of information flow at the output is given by the information per 
sample $I(X;Y)$ times the sample generation rate $f_s$: 

$$R = f_s\, I(X;Y) \tag{1.80}$$
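
As an illustration of equations (1.79) and (1.80), the following sketch computes the mutual information of a discrete joint distribution and the corresponding information rate. The function name `mutual_information`, the example distributions, and the 30 samples-per-second rate are assumptions chosen for illustration.

```python
import math


def mutual_information(joint):
    """Mutual information in bits for a joint distribution given as a list of rows p(x, y)."""
    p_x = [sum(row) for row in joint]               # marginal over y
    p_y = [sum(col) for col in zip(*joint)]         # marginal over x
    info = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0.0:                          # terms with p(x, y) = 0 contribute nothing
                info += p_xy * math.log2(p_xy / (p_x[i] * p_y[j]))
    return info


noiseless = [[0.5, 0.0], [0.0, 0.5]]        # output always equals input
independent = [[0.25, 0.25], [0.25, 0.25]]  # output tells nothing about input

sample_rate = 30.0                          # assumed samples per second
rate = sample_rate * mutual_information(noiseless)
```

A noiseless binary channel with equally likely inputs conveys one bit per sample, whereas statistically independent input and output convey none.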



Information capacity is a fundamental and quantitative bound on the 
ability of a physical system to communicate information [19]. Information 
capacity is defined as the maximum mutual information that can be 
communicated through a channel. This capacity depends only on the 
physical properties of the channel, such as bandwidth, noise, and constraints 
on the signal values; it does not depend on the specific details of particular 
tasks for which the channel may be used. Although task-dependent measures 
of performance are common in engineering, it is appealing to study the 
maximum information rate, or channel capacity, especially for sensory 
devices such as photodetectors that are used for many different tasks. 

A Gaussian channel is an additive noise channel in which the noise is a 
random Gaussian process. For a Gaussian channel, the transmission 
bandwidth and the ratio of the signal power to the noise power are sufficient 
to determine the capacity of the channel to transmit information. For a 
Gaussian channel constrained by an input signal power S, the channel 
capacity is given by the following relation [20]: 



$$C = \Delta f \,\log_2\!\left(1 + \frac{S}{N}\right) \tag{1.81}$$

where $S$ and $N$ are the signal power and noise power, respectively, $\Delta f$ is the 
bandwidth of the channel, and the capacity $C$ is in units of bits per second. 
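
A direct transcription of equation (1.81) is sketched below. The function name `gaussian_capacity` and the numerical values are illustrative assumptions.

```python
import math


def gaussian_capacity(bandwidth_hz, signal_power, noise_power):
    """Capacity in bits per second of a Gaussian channel, following equation (1.81)."""
    return bandwidth_hz * math.log2(1.0 + signal_power / noise_power)


base = gaussian_capacity(1000.0, 15.0, 1.0)          # 1 kHz bandwidth, S/N = 15
wider = gaussian_capacity(2000.0, 15.0, 1.0)         # doubled bandwidth, same S/N
stronger = gaussian_capacity(1000.0, 30.0, 1.0)      # doubled signal power, same bandwidth
```

For fixed signal-to-noise ratio, capacity grows linearly with bandwidth but only logarithmically with signal power. In the pixels considered below, the noise power itself depends on bandwidth, so the two quantities cannot be varied independently.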

In the next few sections, the optical information capacity of APS pixels 
that transduce the optical signal into charge, current, or voltage will be 
compared. The transduced representations of the optical signal may be 
classified as either sampled or continuous. Sampled pixels operate 
essentially by integrating the photocurrent onto a capacitor during a 
sampling period, then resetting the capacitor, and then integrating again 
during the next sampling period. In contrast, continuous pixels transduce the 
photocurrent waveform continuously into output current or voltage signals. 
The three APS pixel structures shown in Figure 1-17 are considered here: 
charge-mode pixel (sampled), current-mode pixel (continuous), and 
voltage-mode pixel (continuous). 

Figure 1-17. Three APS pixels: (a) charge-mode (QM) pixel (sampled); (b) current-mode 
(IM) pixel (continuous); and (c) voltage-mode (VM) pixel (continuous). 

Each of these pixels operates together with a readout circuit, as indicated 
in Figure 1-18. The readout circuitry inevitably contributes additional 
readout noise to the signal of interest. Figure 1-18(a) shows a general 
architecture that places all readout circuitry at the column level. As indicated 
in Figure 1-18(b), sometimes this readout circuitry is split into the pixel 
level and the column level. The latter circuit shows a transistor within the 
pixel operating as a source follower, with the current source shared over the 
entire column. In order to focus on the essential characteristics of the 
different pixel structures under study, the noise contributions of the readout 
circuits are neglected in the following discussion. 

To compute the information rates for these three pixel structures, assume 
that the input is an intensity-modulated optical signal with average 
photocurrent $I_{ph}$ and total signal power $\sigma_s^2 I_{ph}^2$. The variable $I_B$ accounts for 
any constant current through the photodiode that is unrelated to the incident 
optical signal, such as dark current and current due to background light. The 
contrast power of the incident photosignal is the variance $\sigma_s^2$ of the 
normalized photosignal (i.e., the optical signal normalized by its mean 
value). The normalized photosignal is independent of the mean illumination 
level and shows variations due to scene details, such as the relative 
reflectivity of objects within a scene [21-22]. For a sinusoidally modulated 
signal as in equation (1.24), the contrast power is $\sigma_s^2 = M^2/2$. 

Figure 1-18. Schematics of imager pixels (inside the dashed lines) and the corresponding 
circuitry for two readout strategies. (a) All readout circuitry is at the column level. (b) The 
readout circuitry is split into pixel-level and column-level components. 



1.5.2 Charge-mode pixels 

The charge-mode pixel is the basic unit of the CMOS active pixel sensor 
(APS). The charge-mode pixel shown in Figure 1-17(a) consists of a 
photodiode and a reset transistor. In an imager, these elements are 
augmented by a source follower and a switch for row selection within the 
pixel. A current source for the source follower and a correlated double 
sampling (CDS) readout circuit are shared by pixels in the same column (as 
discussed in the previous section). The operation of the charge-mode pixel 
has three stages: reset, integration and readout. In the reset stage, the reset 
transistor is turned on and the detecting node is initialized to a high value. 
During integration, the photocurrent collected by the photodiode is 
accumulated by discharging the detecting node. After the integration period, 
the pixel value is read out by the CDS circuit and stored at the column level. 
The cycle starts again by resetting the pixel and beginning to accumulate 
photocurrent. The stored values in each column are read out before the end 
of the next cycle. 

The input-referred noise may be calculated separately for each reset and 
integration cycle and then summed to determine the total input-referred 
noise (neglecting readout noise as discussed above). 

The reset period is normally much shorter than the settling time for the 
active pixel sensor; therefore, steady-state operation is not achieved and the 
reset noise power must be calculated using temporal analysis as in Tian et al. 
[23]. The mean-square noise voltage at the end of the reset period is 

$$\overline{V_n^2}(t_r) = \frac{kT}{2C_{out}}\left[1 - \frac{t_{th}^2}{(t_r - t_1 + t_{th})^2}\right] \tag{1.82}$$

where $t_r$ is the reset time, $t_1$ is the time that the reset transistor operates in the 
above-threshold region, and $t_{th}$ is the time required to charge the detecting 
node capacitance $C_{out}$ up to the thermal voltage (using the current of the reset 
transistor at the point where it enters subthreshold operation). Since the reset 
time is usually several microseconds and therefore three orders of magnitude 
larger than $t_1$ and $t_{th}$, the mean-square noise voltage is approximately equal to 
$kT/(2C_{out})$. 

During integration, the reset transistor is turned off. The only noise 
source is the shot noise from the photodiode. If it is assumed that the 
capacitance at the detecting node remains constant during integration, the 
mean-square noise voltage sampled at the end of integration will be 



$$\overline{V_n^2}(t_{int}) = \frac{q\,(I_{ph} + I_B)\,t_{int}}{C_{out}^2} \tag{1.83}$$

where $t_{int}$ is the integration time. 

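Thus, the information rate of the charge-mode pixel is given by 

$$I_{QM} = \frac{1}{2t_{int}}\,\log_2\!\left(1 + \frac{\sigma_s^2 I_{ph}^2\, t_{int}^2 / C_{out}^2}{\dfrac{kT}{2C_{out}} + \dfrac{q\,(I_{ph} + I_B)\,t_{int}}{C_{out}^2}}\right) \tag{1.84}$$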



where $1/(2t_{int})$ is the bandwidth of the pixel with sampling rate $1/t_{int}$. The 
reset time $t_r$ can be neglected since it is normally much shorter than the 
integration time $t_{int}$. Thus, the information rate of the charge-mode pixel is a 
function of the integration time $t_{int}$, the average photocurrent $I_{ph}$, the 
capacitance of the detecting node $C_{out}$, the absolute temperature $T$, and the 
background current $I_B$. If $1/(2t_{int})$ is considered to be the bandwidth $\Delta f$ of the 
charge-mode pixel, equation (1.84) may be rewritten as 

$$I_{QM} = \Delta f\,\log_2\!\left(1 + \frac{\sigma_s^2 I_{ph}^2}{\left[2kTC_{out}\,\Delta f + 2q\,(I_{ph} + I_B)\right]\Delta f}\right) \tag{1.85}$$

1.5.3 Current-mode pixels 

The current-mode pixel shown in Figure 1-17(b) directly transduces the 
photocurrent generated by the photodiode into a current signal. The pixel 
consists solely of a photodiode and serves as a point of comparison with the 
continuous voltage-mode pixel and the sampled charge-mode pixel. In 
practice, this photodiode is used with a switch for row selection within the 
pixel, and possibly with some other current steering or amplification 
components as well. 

Since the photodiode contributes shot noise of $2q(I_{ph} + I_B)\Delta f$ and the 
active load circuitry (not shown in Figure 1-17(b)) also contributes a 
variance of $2q(I_{ph} + I_B)\Delta f$, the mean-square noise current is 

$$\overline{i_n^2} = 4q\,(I_{ph} + I_B)\,\Delta f \tag{1.86}$$

where $\Delta f$ is the bandwidth of the measurement. Thus, the information rate of 
the current-mode pixel is given by 



$$I_{IM} = \Delta f\,\log_2\!\left(1 + \frac{\sigma_s^2 I_{ph}^2}{4q\,(I_{ph} + I_B)\,\Delta f}\right) \tag{1.87}$$



which is a function of the bandwidth $\Delta f$, the average photocurrent $I_{ph}$, and 
the background current $I_B$. If the currents ($I_{ph}$ and $I_B$) and contrast power ($\sigma_s^2$) 
are fixed, the information rate is a monotonic function of frequency 
bandwidth and approaches its maximum value as the frequency bandwidth 
increases without bound: 

$$I_{IM}^{max} = \lim_{\Delta f \to \infty} \Delta f\,\log_2\!\left(1 + \frac{\sigma_s^2 I_{ph}^2}{4q\,(I_{ph} + I_B)\,\Delta f}\right) = \frac{1}{\ln 2}\,\frac{\sigma_s^2 I_{ph}^2}{4q\,(I_{ph} + I_B)} \tag{1.88}$$
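
The approach to the limiting value in equation (1.88) can be checked numerically. The sketch below assumes the current-mode rate has the form of equation (1.87), with a signal power equal to the contrast power times the squared photocurrent and a total shot noise of $4q(I_{ph} + I_B)\Delta f$. The default parameter values (100 fA photocurrent, 2 fA background current, contrast power 0.1) are borrowed from the caption of Figure 1-20 purely for illustration, and the function names are hypothetical.

```python
import math

Q_E = 1.602176634e-19   # elementary charge, C


def current_mode_rate(delta_f, i_ph=100e-15, i_b=2e-15, contrast=0.1):
    """Information rate (bits/s) of the idealized current-mode pixel at bandwidth delta_f (Hz)."""
    snr = contrast * i_ph**2 / (4.0 * Q_E * (i_ph + i_b) * delta_f)
    return delta_f * math.log2(1.0 + snr)


def current_mode_limit(i_ph=100e-15, i_b=2e-15, contrast=0.1):
    """Limiting rate as the bandwidth grows without bound."""
    return contrast * i_ph**2 / (4.0 * Q_E * (i_ph + i_b) * math.log(2.0))


bandwidths = [1.0, 1e2, 1e4, 1e6, 1e8]
rates = [current_mode_rate(f) for f in bandwidths]
limit = current_mode_limit()
```

The rates in this sketch rise monotonically with bandwidth and saturate just below the limit, because the signal-to-noise ratio falls in inverse proportion to bandwidth.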



The current-mode pixel shown in Figure 1-17(b) is an idealization 
intended to illustrate the performance associated with direct measurement of 
a photocurrent. Some active load is clearly required to communicate the 
photocurrent to the outside world. One implementation for such an active 
load is a source-follower transistor, as shown in Figure 1-19. In this case, the 
pixel produces an output voltage that is logarithmically related to the input 
photocurrent. The input photocurrent and the shot noise are identical to those 
for the current-mode pixel described above, and both are transferred into the 
signal and noise components of the voltage output. The transfer functions for 
signal and noise are identical, so the information rate for the logarithmic 
voltage-mode pixel is the same as that given in equation (1.87) for the 
current-mode pixel. 

Figure 1-19. Logarithmic voltage-mode pixel (continuous). 

1.5.4 Voltage-mode pixels 

The linear voltage-mode pixel shown in Figure 1-17(c) converts the 
photocurrent into an output voltage signal using a linear resistor. The 
mean-square signal voltage is given by 

$$\overline{V_s^2} = \sigma_s^2 I_{ph}^2 R^2 \tag{1.89}$$

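Since the photodiode contributes a variance of $2q(I_{ph} + I_B)R^2\Delta f$ and the 
resistance contributes a variance of $4kTR\,\Delta f$, the mean-square noise voltage is 
given by 

$$\overline{v_n^2} = \left[4kTR + 2q\,(I_{ph} + I_B)\,R^2\right]\Delta f \tag{1.90}$$

Substituting equations (1.89) and (1.90) into equation (1.81) gives the 
information rate of the linear voltage-mode pixel: 

$$I_{VM} = \Delta f\,\log_2\!\left(1 + \frac{\sigma_s^2 I_{ph}^2}{\left[4kT/R + 2q\,(I_{ph} + I_B)\right]\Delta f}\right) \tag{1.91}$$

The information rate of the linear voltage-mode pixel given in (1.91) is a 
monotonic function of bandwidth $\Delta f$ when other variables are fixed, and 
approaches a maximum value as the bandwidth increases without bound: 

$$I_{VM}^{max} = \lim_{\Delta f \to \infty} \Delta f\,\log_2\!\left(1 + \frac{\sigma_s^2 I_{ph}^2 R^2}{\left[4kTR + 2q\,(I_{ph} + I_B)\,R^2\right]\Delta f}\right) = \frac{1}{\ln 2}\,\frac{\sigma_s^2 I_{ph}^2}{4kT/R + 2q\,(I_{ph} + I_B)} \tag{1.92}$$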



1.5.5 Comparison 

Unlike the current-mode and linear voltage-mode pixels, the information 
rate of the charge-mode pixel is not a monotonic function of the 
measurement bandwidth $\Delta f$. Figure 1-20 shows the information rate of the 
charge-mode pixel as integration time varies; the information rate exhibits a 
maximum at a finite integration time. This result apparently contradicts the 
intuitive idea that high-quality images require long integration times (under 
the implicit assumption that the scene is static during the integration time). 
However, information rate reflects both the quality of the picture and the 
temporal variation of the scene. Integration times that are too short result in 
poor signal-to-noise ratios, whereas integration times that are too long 
sacrifice details regarding changes in the scene. In contrast, information rates 
for the current-mode and voltage-mode pixels increase monotonically as the 
bandwidth increases to infinity. 
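
The existence of an optimal integration time can be reproduced with a short numerical sweep. The sketch below assumes that the charge-mode rate has the form of equation (1.85) with $\Delta f = 1/(2t_{int})$, a reset-noise contribution derived from $kT/(2C_{out})$, and photodiode shot noise from $I_{ph} + I_B$. The default parameters follow the values listed in the caption of Figure 1-20; the function name `charge_mode_rate` and the sweep range are illustrative choices.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def charge_mode_rate(t_int, i_ph=100e-15, i_b=2e-15, c_out=10e-15,
                     temp=300.0, contrast=0.1):
    """Information rate (bits/s) of a charge-mode pixel for integration time t_int (s)."""
    delta_f = 1.0 / (2.0 * t_int)                      # bandwidth for sampling rate 1/t_int
    reset_term = 2.0 * K_B * temp * c_out * delta_f    # contribution of reset noise
    shot_term = 2.0 * Q_E * (i_ph + i_b)               # contribution of shot noise
    snr = contrast * i_ph**2 / ((reset_term + shot_term) * delta_f)
    return delta_f * math.log2(1.0 + snr)


# Sweep integration times from 1 microsecond to 100 ms on a logarithmic grid.
t_grid = [10.0 ** (-6.0 + 0.05 * k) for k in range(101)]
best_t = max(t_grid, key=charge_mode_rate)
print(f"best integration time ~ {best_t:.2e} s, "
      f"rate ~ {charge_mode_rate(best_t):.0f} bits/s")
```

With these assumed values the maximum falls at a sub-millisecond integration time, and the rate drops off on either side: reset noise dominates at very short integration times, while the sampling rate limits the rate at long ones.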

The information rate also varies with other parameters of the pixels. The 
information rate of a charge-mode pixel decreases with increasing 
capacitance at the detecting node. The information rate of the linear voltage- 
mode pixel decreases with decreasing resistance according to equation 
(1.91). Increasing temperature reduces the information rate for both 
charge-mode and voltage-mode pixels, and increasing background current $I_B$ 
(including dark current and photocurrent unrelated to the signal) reduces the 
information rate for all pixels. 

In Figures 1-21 and 1-22, the information rates of the three pixels are 
compared (a) for a fixed bandwidth of 1/(2 × 30 ms) = 16.7 Hz as the 
photocurrent varies, and (b) for an optimal bandwidth as the photocurrent 
varies (with limiting values of detecting node capacitance for the 
charge-mode pixel and of resistance for the linear voltage-mode pixel). The first 
case corresponds to an integration time of 30 milliseconds, which is typical 
for video applications. The second case illustrates the limiting behavior as 
the photocurrent varies. 

Figure 1-20. The information rate of a charge-mode pixel as a function of integration time $t_{int}$ 
for an average photocurrent of 100 fA, a background current of 2 fA, a detecting node 
capacitance of 10 fF, a temperature of 300 K, and a contrast power of 0.1. 

The first comparison is of the information rates for fixed measurement 
bandwidth and integration time, as given by equations (1.85), (1.87), and 
(1.91). In this case, each of the pixels has two distinct regimes of behavior as 
a function of photocurrent. 

At sufficiently low values of the photocurrent (e.g., $I_{ph} \ll I_B$), the $I_{ph}$ term 
in the denominator is negligible compared to the other terms, and the 
information rates of the three pixels can be approximated as 

$$I = \Delta f\,\log_2\!\left(1 + \frac{\sigma_s^2 I_{ph}^2}{a\,\Delta f}\right) \tag{1.93}$$

where $a$ is independent of photocurrent and takes different values for the 
three pixel structures under study: $a_{QM} = 2kTC_{out}\Delta f + 2qI_B$, $a_{IM} = 4qI_B$, and 
$a_{VM} = 4kT/R + 2qI_B$. 

Figure 1-21. Information rates of the charge-mode pixel (QM), the current-mode pixel (IM), 
and the linear voltage-mode pixel (VM) as functions of the photocurrent for a measurement 
bandwidth of 16.7 Hz. Pixel background current is 2 fA, temperature is 300 K, contrast power 
is 0.1, detecting node capacitance is 10 fF, and resistance is 300 GΩ. 

For sufficiently high photocurrents, the term $I_{ph}$ dominates the other 
terms in the denominator, and the information rates of the three pixels can be 
approximated as 

$$I = \Delta f\,\log_2\!\left(1 + \frac{\sigma_s^2 I_{ph}}{b\,\Delta f}\right) \tag{1.94}$$



where $b$ is independent of photocurrent and takes different values for the 
three pixel structures under study: $b_{QM} = 2q$, $b_{IM} = 4q$, and $b_{VM} = 2q$. At very 
low values of the photocurrent, information increases quadratically with 
photocurrent. For intermediate photocurrents, information increases in 
proportion to $\log(I_{ph}^2) = 2\log(I_{ph})$; for large photocurrents, it increases as 
$\log(I_{ph})$. These properties of the information rates for the three pixel 
structures are shown in the S-shaped curves of Figure 1-21. For very low 
photocurrents I ph and for small I B , the information rate of the current-mode 
pixel is the largest of the three. For large photocurrents, the current-mode 
pixel has the smallest information rate, because the shot noise contributed by 
the active load of this pixel is (in this case) larger than both the reset noise of 
the charge-mode pixel and the thermal noise of the voltage-mode pixel. 
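
The two regimes described above can be explored numerically. The following is a minimal sketch, assuming that the full denominator of the information-rate expression has the form (a + b·I_ph)·Δf, which reduces to equations (1.93) and (1.94) in the low and high photocurrent limits. This combined form, the function names, and the default arguments are illustrative assumptions; they are not equations (1.85), (1.87), and (1.91) themselves. The parameter values are those quoted in the caption of Figure 1-21.

```python
import math

Q = 1.602e-19    # electron charge (C)
K_B = 1.381e-23  # Boltzmann constant (J/K)

def information_rate(i_ph, a, b, bandwidth, contrast_power=0.1):
    """Assumed form: df * log2(1 + sigma_s^2 * I_ph^2 / ((a + b * I_ph) * df))."""
    snr = contrast_power * i_ph ** 2 / ((a + b * i_ph) * bandwidth)
    return bandwidth * math.log2(1.0 + snr)

def pixel_coefficients(bandwidth, i_b=2e-15, temp=300.0, c_out=10e-15, r_load=300e9):
    """Return {pixel: (a, b)} using the coefficient values quoted in the text."""
    return {
        "QM": (2 * K_B * temp * c_out * bandwidth + 2 * Q * i_b, 2 * Q),
        "IM": (4 * Q * i_b, 4 * Q),
        "VM": (4 * K_B * temp / r_load + 2 * Q * i_b, 2 * Q),
    }

def compare(i_ph, bandwidth=16.7):
    """Information rate (bits/s) of each pixel type at photocurrent i_ph (A)."""
    coeffs = pixel_coefficients(bandwidth)
    return {name: information_rate(i_ph, a, b, bandwidth)
            for name, (a, b) in coeffs.items()}

if __name__ == "__main__":
    for i_ph in (1e-17, 1e-15, 1e-13, 1e-11, 1e-9):
        rates = compare(i_ph)
        print(f"{i_ph:.0e} A:", {k: round(v, 4) for k, v in rates.items()})
```

Under these assumptions the current-mode pixel ranks highest at very low photocurrent and lowest at high photocurrent, and doubling a very small photocurrent roughly quadruples the rate, in line with the trends described in the text.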

The second comparison is of idealized pixels at limiting values of 
detecting node capacitance and load resistance, giving zero reset noise and 
zero thermal noise for the charge-mode pixel and the linear voltage-mode 
pixel, respectively. Under these conditions, the information rates of all three 
pixels increase monotonically as measurement bandwidth increases. Figure 
1-22 shows the information rates for infinite bandwidth as a function of 
photocurrent. This figure shows that the information rates for the 
charge-mode pixel and the voltage-mode pixel are identical when reset noise and 
thermal noise are neglected. In addition, the optimal information rate 
increases as $I_{ph}^2$ for small photocurrents and as $I_{ph}$ for large photocurrents. 

Figure 1-22. The maximum information rates of charge-mode pixel (QM), current-mode pixel 
(IM), and linear voltage-mode pixel (VM_R) as functions of the photocurrent. 
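
The trends quoted for Figure 1-22 follow from a standard limit. As a sketch, assume that the idealized charge-mode and voltage-mode pixels retain only shot noise, so that the denominator of the information-rate expression becomes $2q(I_{ph} + I_B)\Delta f$. This form is an assumption inferred from the coefficient values quoted with equations (1.93) and (1.94); it is not a formula stated in the text.

```latex
\begin{aligned}
I_{\max}
  &= \lim_{\Delta f \to \infty} \Delta f \,
     \log_2\!\left(1 + \frac{\sigma_s^2 I_{ph}^2}{2q\,(I_{ph} + I_B)\,\Delta f}\right)
   = \frac{\sigma_s^2 I_{ph}^2}{2q\,(I_{ph} + I_B)\,\ln 2}, \\[4pt]
I_{\max}
  &\approx \frac{\sigma_s^2 I_{ph}^2}{2q\, I_B \ln 2}
   \quad (I_{ph} \ll I_B),
  \qquad
I_{\max}
   \approx \frac{\sigma_s^2 I_{ph}}{2q \ln 2}
   \quad (I_{ph} \gg I_B).
\end{aligned}
```

The first line uses $\lim_{x\to\infty} x \log_2(1 + c/x) = c/\ln 2$. The two limiting forms reproduce the quadratic growth at small photocurrents and the linear growth at large photocurrents.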

1.6 Summary 

To understand semiconductor phototransduction, relevant concepts from 
semiconductor physics have been reviewed. As an indirect bandgap material, 
silicon is not an ideal material for photodetection because of its low 
absorption coefficient. However, the maturity of silicon fabrication 
technology and silicon integrated circuitry development makes silicon the 
most popular material for image sensors. 

Several photodetectors relevant to digital photography were examined 
with respect to performance metrics such as responsivity, noise, and 
response time. The response speeds of all the photodetectors discussed here 
are adequate for imaging applications; therefore, the primary focus was on 
their relative quantum efficiency and NEP, which reflect responsivity and 
device noise, respectively. Photoconductors and phototransistors have gains 
much larger than one, and consequently have higher noise than other 
photodetectors such as photodiodes and photogates. In addition, the 
conductivity of on-chip resistors is subject to significant process variation, 
so photoconductors are rarely used in image sensors. Choices among the 
remaining three photodetectors depend on factors such as light intensity, 
imaging frame rate, and pixel size. 

Photodiodes and photogates are compatible with standard CMOS 
fabrication technology, so they are the most popular choices for APS image 
sensors. Although CCD image sensors are most common in digital 
photography, APS image sensors are gaining popularity because of their low 
voltage operation, low power consumption, highly compact size, and low 
cost. The information rates of APS pixels were compared by considering 
them as Gaussian channels through which optical signals are transduced and 
communicated. All pixels show similar information rates at 30 frames per 
second, and show similar trends in the information capacity as the 
photocurrent varies. 

Abstract: The modulation transfer function (MTF) of an optical or electro-optical device 
is one of the most significant factors determining the image quality. 
Unfortunately, characterization of the MTF of the semiconductor-based focal 
plane arrays (FPA) has typically been one of the more difficult and error-prone 
performance testing procedures. Based on a thorough analysis of experimental 
data, a unified model has been developed for estimation of the overall CMOS 
active pixel sensor (APS) MTF for scalable CMOS technologies. The model 
covers the physical diffusion effect together with the influence of the pixel 
active area geometrical shape. Agreement is excellent between the results 
predicted by the model and the MTF calculated from the point spread function 
(PSF) measurements of an actual pixel. This fit confirms the hypothesis that 
the active area shape and the photocarrier diffusion effect are the determining 
factors of the overall CMOS APS MTF behavior, thus allowing the extraction 
of the minority-carrier diffusion length. Section 2.2 presents the details of the 
experimental measurements and the data acquisition method. Section 2.3 
describes the physical analysis performed on the acquired data, including the 
fitting of the data and the relevant parameter derivation methods. Section 2.4 
presents a computer model that empirically produces the PSF of the pixel. The 
comparisons between the modeled data and the actual scanned results are 
discussed in Section 2.5. Section 2.6 summarizes the chapter. 

Key words: CMOS image sensor, active pixel sensor (APS), modulation transfer function 
(MTF), point spread function (PSF), diffusion process, parameter estimation, 
modeling. 

2.1 Introduction 

Recent advances in CMOS technology have made smaller pixel design 
possible. A smaller pixel can improve the spatial resolution of an imager by 
increasing its pixel count. However, the imager is then more susceptible to 
carrier crosstalk, which works against the spatial resolution. In addition, 
smaller pixels tend to have an inferior signal-to-noise ratio (SNR) because 
the photon flux that they receive is reduced. Charge collection and pixel 
spatial resolution analysis are therefore important in designing a smaller 
pixel. 

The modulation transfer function (MTF) is the most widely used spatial 
resolution index. It is defined as the transfer ratio between the imager input 
and output signal modulations as a function of the spatial frequency of the 
input signal. Several MTF models have been developed for CCD imagers. 
Numerous factors affecting MTF have been discussed, including carrier 
diffusion, epi-layer thickness, substrate doping, and others [1-9]. However, 
these models do not work very well for CMOS imagers. Most CMOS 
imagers use photodiodes for charge collection, and the collection mechanism 
of photodiodes differs from that of the potential well used by CCDs. Only a 
portion of the charges is collected in the region enclosed by the photodiode. 
Considerable numbers of charge carriers are generated in the photodiode 
surroundings, and because of the smaller fill factor, these can be collected by 
any nearby diode. An applicable model for charge collection MTF is needed 
for efficient designs. 

This chapter provides a logical extension of a pixel response analysis by 
Yadid-Pecht [10]. The pixel photoresponse is analyzed and a comprehensive 
MTF model is described, enabling a reliable estimate of the degradation of 
the imager performance. 



2.1.1 Optical and modulation transfer functions 

The optical transfer function (OTF) determines the output for any given 
input. It is defined in a way similar to the Fourier transform of the spread 
function in electronics; i.e., it is the “impulse” response normalized to its 
own maximum value, which occurs at zero spatial frequency: 



$$\mathrm{OTF} = \tau(\omega_x, \omega_y) = \frac{\iint_{-\infty}^{\infty} s(x', y') \exp[-j(\omega_x x' + \omega_y y')]\,dx'\,dy'}{\iint_{-\infty}^{\infty} s(x', y')\,dx'\,dy'} = \frac{S(\omega_x, \omega_y)}{S(0,0)} = \mathrm{MTF}\cdot\exp[j\,\mathrm{PTF}] \tag{2.1}$$

where $S(\omega_x, \omega_y)$ is the Fourier transform of the spread function or impulse 
response $s(x,y)$. The magnitude of the OTF is called the modulation transfer 
function. It is a measure of how well the system accurately reproduces the 
scene. In other words, MTF is a measure of the ability of an imaging 
component (or system) to transfer the spatial modulation from the object to 
the image plane. Spatial modulation of light irradiance is related to image 
quality. If there is no such modulation, irradiance is uniform and there is no 
image. The highest spatial frequency that can be accurately reproduced is the 
system cutoff frequency. The maximum frequency an imager can detect 
without aliasing is defined as the Nyquist frequency, which is equal to one 
over two times the pixel pitch, i.e., 1/(2 × pixel pitch). 

The phase transfer function (PTF) is not nearly as important to 
resolution as MTF, but nevertheless it often cannot be neglected [11]. The 
spatial phase determines the position and orientation of the image rather than 
the detail size. If a target is displaced bodily in the image plane such that 
each part of the target image is displaced by the same amount, the target 
image is not distorted. However, if portions of the image are displaced more 
or less than other portions, then the target image is distorted. This 
information is contained in the PTF. 

The physical implications of the OTF are closely analogous to those of the 
electrical transfer functions. Both permit determination of the output for any 
given input. In both cases, the Fourier transform of a delta function is a 
constant (unity) that contains all frequencies — temporal in electronics and 
spatial in optics — at constant (unity) amplitude. Therefore, a delta function 
input permits an indication of the frequency response of the system, that is, 
which frequencies pass through the system unattenuated and which 
frequencies are attenuated and by how much. A perfect imaging system (or 
any other transform system) requires resemblance between input and output; 
this would require infinite bandwidth so that an impulse function input 
would result in an impulse function output. Therefore, to obtain a point 
image for a point object, an infinite spatial frequency bandwidth is required 
of the imaging system. Physically, this means that because of the diffraction 
effects of optical elements such as camera lenses or pixels, these elements 
must be of infinite diameter to eliminate diffraction at edges. This, of course, 
is not practical. As a consequence, the smaller the aperture the greater the 
relative amount of light diffracted and the poorer the image quality. 

The mathematical concept of the transfer function is accompanied by a 
real physical phenomenon called “contrast”. For a one-dimensional 
sinusoidally varying object wave, 

$$I_{OB}(x) = b + a\cos(2\pi f_{x0}\,x)$$

Figure 2-1. One-dimensional object irradiance. 

The modulation contrast object (MCO), which describes the contrast in the 
object plane, is defined as 

$$\mathrm{MCO} = \frac{I_{o\,max} - I_{o\,min}}{I_{o\,max} + I_{o\,min}} = \frac{a}{b} \tag{2.2}$$

In the image plane, 

$$I_{image}(x') = \int I_{OB}(x)\, s(x - x')\,dx \tag{2.3}$$

Thus, 

$$I_{image}(x') = b\,S(0)\left[1 + \frac{a}{b}\,\frac{S(f_{x0})}{S(0)}\cos\!\big(2\pi f_{x0}\,x' + \Phi(f_{x0})\big)\right] \tag{2.4}$$



where $S(f_{x0})$ is the Fourier transform of the spread function, and $S(0)$ is the 
Fourier transform for the DC-level spread function. For a linear imaging 
system, the image of a cosine is thus a cosine of the same spatial frequency, 
with possible changes in the phase of the cosine $\Phi(f_{x0})$, which imply changes 
in position. The modulation contrast image (MCI), i.e., the contrast function 
in the image plane, is 

$$\mathrm{MCI} = \frac{I_{IM\,max} - I_{IM\,min}}{I_{IM\,max} + I_{IM\,min}} = \frac{a}{b}\,\frac{S(f_{x0})}{S(0)} \tag{2.5}$$

The modulation contrast function (MCF) or the contrast transfer function 
(CTF) is defined as the ratio of the image (output) modulation contrast to 
that of the object (input). For a cosine object, 

$$\mathrm{MCF} = \mathrm{CTF} = \frac{\mathrm{MCI}}{\mathrm{MCO}} = \frac{S(f_{x0})}{S(0)} = \mathrm{MTF} \tag{2.6}$$



The MTF characteristics of a sensor thereby determine the upper limits of 
the image quality, i.e., the image resolution or sharpness in terms of contrast 
as a function of spatial frequency, normalized to unity at zero spatial 
frequency. 
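
The relation between contrast and MTF can be checked numerically. The following is a minimal sketch, assuming a unit-area box spread function (an illustrative choice, not the measured spread function of any sensor discussed here) and hypothetical values for the object parameters a and b. It blurs a cosine object, measures the modulation contrast before and after, and compares their ratio with the magnitude of the normalized Fourier transform of the box.

```python
import math

def image_irradiance(x, a, b, freq, width, steps=200):
    """Image of the cosine object b + a*cos(2*pi*freq*x) after blurring with a
    unit-area box spread function of the given width (midpoint-rule integral)."""
    dx = width / steps
    total = 0.0
    for k in range(steps):
        u = x - width / 2 + (k + 0.5) * dx
        total += (b + a * math.cos(2 * math.pi * freq * u)) * dx
    return total / width

def modulation_contrast(values):
    """(max - min) / (max + min), the definition used for MCO and MCI."""
    return (max(values) - min(values)) / (max(values) + min(values))

def contrast_transfer(freq, width, a=0.5, b=1.0, samples=400):
    """Ratio MCI / MCO evaluated over one period of the cosine object."""
    period = 1.0 / freq
    xs = [i * period / samples for i in range(samples)]
    obj = [b + a * math.cos(2 * math.pi * freq * x) for x in xs]
    img = [image_irradiance(x, a, b, freq, width) for x in xs]
    return modulation_contrast(img) / modulation_contrast(obj)

if __name__ == "__main__":
    width = 5.0  # hypothetical box width, arbitrary length units
    for freq in (0.01, 0.05, 0.1, 0.15):
        arg = math.pi * freq * width
        expected = abs(math.sin(arg) / arg)  # |S(f)| / S(0) for a box
        print(freq, round(contrast_transfer(freq, width), 4), round(expected, 4))
```

The two printed columns should agree closely, which illustrates that the contrast ratio of a cosine object equals the normalized magnitude of the transform of the spread function at that frequency.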




Figure 2-2. A typical MTF curve indicating which frequencies pass through the system 
unattenuated and which frequencies are attenuated and by how much. 

2.1.2 Point spread function: the dependence on pixel size 

The MTF varies in general as a function of illumination wavelength and 
can vary as a function of illumination intensity. Therefore, the conditions 
and techniques used for MTF measurement should be chosen based on the 
ultimate application of the focal plane array (FPA). A detailed analysis and 
comparison of the various MTF techniques was performed by T. Dutton, 
T. Lomheim et al. [12]. 

The point spread function (PSF, also known in the electro-optical 
community as the aperture response profile) is the spatial analogue of the 
two-dimensional MTF. A direct measurement of the PSF provides maximum 
insight into the vertical layer structure of the photosensor layout. It permits 
the experimental determination of the photodiode aperture response behavior 
(the geometric MTF) and the APS depletion/diffusion structure onset (the 
diffusion MTF). 

To measure the PSF, it is important to determine the size at which an 
object can be considered a point object. The object-pixel is the smallest 
element recordable in the image space. The brightness value represents 
average irradiance over that small portion of the image scene. Object-pixel 
size is often related to the detector size. If only a portion of the detector is 
illuminated, the output current is equivalent to that obtained for the same 
total radiant power absorbed by the detector but averaged over the entire 
detector area. No detail smaller than that object-pixel can be resolved. 
Because of mechanical, atmospheric, and detector imperfections in the 
image system, the actual spread function obtained from the measured object- 
pixel is usually larger than the true object-pixel; thus, it represents an overall 
system response or spread function. Note that the object-pixel size strongly 
affects the system MTF. If an object-pixel represents a point image, then the 
pixel size and shape determines the minimum spread function. A best-case 
transform function for such an imaging system is thus a normalized Fourier 
transform of the pixel shape. For a rectangular pixel array, the MTF is 
then determined by the pixel (and pitch) dimensions through the well-known sinc 
formula: 

$$H(\omega) = \frac{1}{p}\int_{-p/2}^{p/2} h(x)\,e^{-jkx}\,dx = \frac{1}{p}\int_{-a/2}^{a/2} 1\cdot e^{-jkx}\,dx = \frac{a}{p}\,\frac{\sin(ka/2)}{ka/2} \tag{2.7}$$

where $p$ is the pitch size, $a$ is the sensor size, $k$ is the angular frequency, and 
$h(x)$ is the impulse response. In this example, the MTF falls to zero at 
$k = 2\pi/a$ (a spatial frequency of $1/a$), where the sinc function reaches its first zero. As $a$ decreases, the pixel 
size MTF broadens. A decrease in pixel size means that smaller details can 
be resolved, corresponding to a larger frequency bandwidth. The smaller the 
pixel dimensions, the larger the spatial-frequency bandwidth. A good tool 
for measuring the MTF of the detector or of the system (taking into 
consideration the imperfections of the light propagation channel) is a point 
source corresponding to the image plane size. 
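
The dependence of the geometric MTF on pixel size can be illustrated with a short calculation. This is a minimal sketch of equation (2.7), assuming k = 2π × (spatial frequency); the 10 µm pitch and the aperture values are hypothetical and chosen only for illustration.

```python
import math

def geometric_mtf(freq, aperture, pitch, normalized=True):
    """Evaluate |sin(k a / 2) / (k a / 2)| with k = 2*pi*freq.

    With normalized=False the a/p prefactor of equation (2.7) is kept;
    with normalized=True the result is scaled to unity at zero frequency.
    """
    arg = math.pi * freq * aperture  # equals k * a / 2
    sinc = 1.0 if arg == 0.0 else math.sin(arg) / arg
    value = abs(sinc)
    return value if normalized else value * aperture / pitch

def nyquist_frequency(pitch):
    """Nyquist frequency 1 / (2 * pixel pitch)."""
    return 1.0 / (2.0 * pitch)

if __name__ == "__main__":
    pitch_um = 10.0  # hypothetical pixel pitch
    f_nyq = nyquist_frequency(pitch_um)
    for aperture_um in (10.0, 5.0, 2.5):
        mtf = geometric_mtf(f_nyq, aperture_um, pitch_um)
        print(f"a = {aperture_um} um: normalized MTF at Nyquist = {mtf:.3f}")
```

The normalized MTF at the Nyquist frequency rises as the aperture shrinks, which is the broadening described above. The unnormalized value falls with a/p, reflecting the smaller collected signal.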

2.1.3 CMOS APS MTF modeling: a preview 

Solid-state imagers are based upon rectangular arrays of light-sensitive 
imaging sites, also called picture elements or pixels. In CMOS APS arrays, 
the pixel area is constructed of two functional parts. The first part, which has a 
certain geometrical shape, is the sensing element itself: the active area that 
absorbs the illumination energy within it and turns that energy into charge 
carriers. Active pixel sensors usually consist of photodiode or photogate 
arrays [13-16] in a silicon substrate. Each imaging site has a depletion 
region of several micrometers near the silicon surface. Perfect collection 
efficiency is assumed for carriers at or within the depletion region, and 
therefore any photocarrier generated in this depletion region is collected at 
the imaging site. The second part of the pixel area is the control circuitry 
required for readout of the collected charge. The fill factor (FF) for APS 
pixels is less than 100%, in contrast to CCDs where the FF can 
approach 100%. The preferred shape of the active area of a pixel is square. 
However, designing the active area as a square can reduce the fill factor. 
Since it influences the signal and the SNR, the fill factor should be as high as 
possible. Figure 2-3 shows a pixel with an L-shaped active area, which is the 
type most commonly used. 

Photon absorption in the silicon depends on the absorption coefficient α, 
which is a function of the wavelength. Blue light (with wavelengths of 
λ ≈ 0.4 µm) is strongly absorbed in the first few micrometers of silicon, 
since α is large in this spectral region. Longer wavelengths (such as 
λ ≈ 0.6 µm) have a smaller absorption coefficient, which means more of the 
photocarriers can be generated outside the depletion regions. Before they are 
lost to a bulk recombination process, these carriers can diffuse to the original 
imaging site or to a nearby site where they are collected. However, the 
imagers lose resolution as the result of this diffusion process. Figure 2-4 
shows a schematic cross-section of several imager sites, indicating the 
depletion-region boundaries. 

Theoretical calculations to model the effects of the photogenerated 
minority carriers’ spatial quantization, transfer efficiency, and crosstalk in 
CCDs and CISs have been described over the years [1-9]. It has always 
been assumed that the theoretical model includes the solution for the 
continuity equation 

$$D\nabla^2 n - \frac{n}{\tau} = -N_f\,\alpha \exp[-\alpha z] \tag{2.8}$$

where $n$ represents the minority carrier concentration, $N_f$ is the flux 
transmitted to the substrate of the sensor at some position $x$, α is the 
absorption coefficient, $z$ the depth within the semiconductor, τ the minority 
carrier lifetime, and $D$ the diffusion coefficient. 

Figure 2-3. Example layout of a pixel design with an L-shaped active area. 

The standard diffusion MTF models must be modified to account for 
modern layer structure as discussed by Blouke and Robinson [3] and Stevens 
and Lavine [7]. The latter researchers developed an approach wherein the 
pixel aperture and diffusion components of the MTF are treated in a unified 
manner. It was shown that the multiplication of the diffusion and aperture 
MTF for the purpose of arriving at the overall sensor MTF is only valid for 
the special case in which the pixel aperture is equal to its size or pitch, i.e., 
for a 100% fill factor. In other words, the unification of the geometrical and 
diffusion effects is necessary for the correct MTF representation, i.e., 
$\mathrm{MTF}_U = \mathrm{MTF}_g * \mathrm{MTF}_{diff}$. Lin, Mathur and Chang [17] show that the 
CCD-based diffusion models do not work well for APS imagers, since the latter 
devices have field-free regions between and surrounding the pixel 
photodiodes that contribute to diffusion currents. 

Figure 2-4. A schematic cross-section of several imager sites, illustrating the dependence of 
the diffusion distance on the photodiode geometry. 

This chapter is a logical extension of a theoretical analysis recently 
presented by Yadid-Pecht [10], in which the theoretical MTF for the active- 
area shape of a general pixel was calculated and compared with 
experimental data. It was shown that the active-area shape contributes 
significantly to the behavior of the overall MTF of a CMOS imager. 

The general expression for the MTF for the connected shape pixel is given by 
equation (2.9), which is not legible in the source text. The surviving part of 
its definition involves the sum 

$$\sum_{j=1} A_j \,(a_j - a_{j-1}), \qquad a_0 = 0.$$

(See Figure 2-5, left side.) 

In the analysis for L-shaped active areas [10], four parameters are 
inserted: $a_1$, $a_2$, $A_1$, and $A_2$. In general, $A_i$ is the modulation amplitude and $a_i$ 
is the active area length. For a 2-D array, a multiplication of the MTF in both 
directions is required. To calculate the MTF in the y direction, a simple 
variable change is performed. The modulation amplitude in the x direction is 
actually the length in the y direction, while the length in the x direction is the 
amplitude modulation in the y direction (see Figure 2-5, right side). 

Figure 2-5. (Left) Schematic description of the L-shaped detector. (Right) Schematic 
description of the L-shaped parameters in the y direction (after [10]). 

In that work, analysis of the MTF for the general pixel active area shape 
was considered. An analytical solution was derived for the most commonly 
used shape in practical pixel designs, the L-shaped detector. The actual PSF 
was obtained experimentally via sub-pixel scanning, and the MTF was 
calculated accordingly from the measurements for the different pixel 
designs. 

A good general agreement was found between this calculated MTF and 
the theoretical expectations. The differences that remained between the 
actual and simulated MTFs were mostly due to other factors in the design 
and process, such as diffusion and crosstalk (see Figure 2-6). 

Here we present a more comprehensive model, which takes into account 
the effect of the minority-carrier diffusion together with the effect of the 
pixel active-area shape on the overall CMOS-APS MTF. This is especially 
important for APS design, where the fill factor is always less than 100%. 

2.2 Experimental details 

Our model is based on the measurements of responsivity variation on a 
subpixel scale for the various APS designs. These measurements were 
reported in [10] and will be described here briefly. An optical spot 
approximately 0.5 µm in diameter (He-Ne laser, λ = 0.6 µm) was used to 
scan the APS over a single pixel and its immediate neighbors in a raster 
fashion. This work dealt only with the ideal situation, where light was 
projected directly onto the sensor pixel. Therefore, effects on the MTF due 
to the presence of an optical stack of oxides such as light piping [18] were 
not considered. In addition, the spot size of the laser was small compared to 
the pixel size; therefore, the effects of laser spot profile on the measured PSF 
were not considered. 

Figure 2-6. (Left) MTF obtained from the scanned PSF. (Right) MTF obtained with the 
analytical geometrical model. 

The data acquisition was taken at the center point; i.e., the central pixel 
was read out at each point of the scan (see Figure 2-7). The signal obtained 
as a function of the spot position provided a map of the pixel response. 

Only the central pixel is assumed to be active. The “window region”, i.e., the 
photodiode (or photogate) of the active pixel is the only region where the 
wide (non-zero bias) depletion layer exists and the photocarrier collection 
occurs. The incoming laser light generates (at a certain depth according to 
the exponential absorption law) electron-hole pairs, i.e., minority charge 
carriers. Diffusion of these charge carriers occurs with equal probability in 
all directions, with some diffusing directly to the depletion region where 
they subsequently contribute to the signal. Therefore, the PSF obtained in 
the window region (see Figure 2-8) is due to the detection of those charge 
carriers that successfully diffused to the depletion region. The value at each 
point represents the electrical outcome of the three-dimensional photocarrier 
diffusion (i.e., the integration over the depth at each point) from that point to 
the depletion region. Thus, the 2-D signal map plane obtained in the experiment can 
be generally considered as a “diffusion map” of the 3-D diffusion in the 
device. 



2.3 Physical analysis 

The PSF data obtained from the actual pixel measurements for a square, a 
rectangular, and an L-shaped active area were examined using scientific 
analysis software. Note that, without loss of generality, this chapter 
demonstrates only the results for an L-shaped active area pixel as an 
example. 

Figure 2-7. Geometry of the generalized experiment. The squares represent the APS subarray. 
The optical spot (the small black square) was scanned over the array in a raster fashion within 
a specified region (the shaded squares). 

Figure 2-8. Plot of the actual measured PSF for the L-shaped pixel design (after [10]). The 
lightest areas indicate the strongest response. 

In the following well-known steady-state one-dimensional 
transport equation of excess minority carriers in the semiconductor, 

$$\frac{\partial n_p}{\partial t} = -\frac{n_p - n_{p0}}{\tau_n} + D_n \frac{\partial^2 n_p}{\partial x^2} \tag{2.10}$$

the electron recombination rate can be approximated by $(n_p - n_{p0})/\tau_n$. Here $n_p$ 
is the minority carrier density, $n_{p0}$ is the thermal equilibrium minority carrier 
density, $\tau_n$ is the electron (minority) lifetime, and $D_n$ is the electron diffusion 
coefficient. The solution is given by: 

$$n_p(x,t) = \frac{N}{\sqrt{4\pi D_n t}} \exp\!\left[-\frac{x^2}{4 D_n t} - \frac{t}{\tau_n}\right] + n_{p0} \tag{2.11}$$

where $N$ is the number of electrons generated per unit area. 

Based on this solution, the PSF data acquired from each actual scanning 
was fitted. For each scanned pixel a set of fittings was performed. Figure 2-9 
indicates examples of planes on which fitting was performed for the 
L-shaped pixel. A′–A″ is one of the cross-sections on which a fit was 
performed. 

All the pixels are situated on a common substrate. The photocarrier 
diffusion behavior within the substrate is therefore the same across a given 
pixel array. From the generalization of the fitting results for all pixels, the 
common functional dependence is derived with common parameters that 
describe the diffusion process in the array. The two-dimensional approach 
described earlier allows the desired physical parameters that correspond to 
the actual three-dimensional diffusion to be obtained. 

The model used for fitting is 

$$y = y_0 + \frac{A}{W\sqrt{\pi/2}} \exp\!\left[-\frac{2(x - x_c)^2}{W^2} - 1\right] \tag{2.12}$$



where $x_c$ is the center, $W$ the width, and $A$ the area. The term “−1” describes 
the case of τ = t. It is generally possible to perform fittings with values of t 
equal to various fractions of τ, so that the best correspondence is obtained 
for each window shape. The relevant parameters were derived from a 
comparison of equations (2.11) and (2.12): 

$$W = 2\sqrt{2 D_n \tau_n} \;\Rightarrow\; L_{eff} = \sqrt{D_n \tau_n} \tag{2.13}$$

where $L_{eff}$ is the characteristic effective diffusion length. The common 
average result obtained with this approach was 24 µm. 

Figure 2-9. Plot of the actual measured PSF for the L-shaped pixel design (after [10]). 
Cross-sections used for fitting are located along the arrows, normal to the layout surface. 
The lightest areas indicate the strongest response, as in Figure 2-8. 
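
The step from equations (2.11) and (2.12) to equation (2.13) can be made explicit by matching the Gaussian exponents. The following sketch assumes that the fit is evaluated at $t = \tau_n$, as the text indicates for the “−1” term, and that $x$ in equation (2.11) is measured from the center $x_c$:

```latex
\begin{aligned}
\exp\!\left[-\frac{x^2}{4 D_n t} - \frac{t}{\tau_n}\right]_{t=\tau_n}
  &= \exp\!\left[-\frac{x^2}{4 D_n \tau_n} - 1\right], \\[4pt]
\frac{2}{W^2} = \frac{1}{4 D_n \tau_n}
  &\;\Rightarrow\; W = 2\sqrt{2 D_n \tau_n}, \\[4pt]
L_{eff} = \sqrt{D_n \tau_n}
  &= \frac{W}{2\sqrt{2}} .
\end{aligned}
```

Under these assumptions, a diffusion length of about 24 µm corresponds to a fitted width $W$ of about 68 µm.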

Figure 2-10 shows the curve-fitting function and the actual data points 
for an example cross-section. 

Figure 2-10. Functional analysis of the measured point spread function. The example data 
correspond to the A′–A″ cross-section in Figure 2-9. 

2.4 The unified model description 

Based on the analysis described for the experimental data and the actual 
layout of the pixel array, a unified numerical model was constructed that 
included both the effect of photocarrier diffusion within the substrate and the 
effect of the pixel sampling aperture shape and size within the pixel array. 
This model produces the PSF of the pixel empirically. The extracted 
parameters are used for the creation of a 2-D symmetrical kernel matrix 
(since there is no diffusion direction priority within the uniform silicon 
substrate). The convolution of this matrix with the matrix representing the 
pure geometrical active area shape artificially produces the response 
distribution in the spatial domain. Note that the spread function obtained in 
this way creates a unified PSF; i.e., this method enables modeling of the 
pixel spatial response, which can subsequently be compared with the 
genuine PSF obtained by real measurements. 
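
The convolution step described above can be sketched as follows. This is a minimal illustration only: the Gaussian form of the kernel, the 32 × 32 grid, the L-shaped mask dimensions, and the kernel size and width are all hypothetical choices, since the text specifies only that the kernel is two-dimensional and symmetrical.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a size x size rotationally symmetric kernel with unit sum
    (assumed Gaussian; the kernel form is not given in the text)."""
    half = size // 2
    kernel = [[math.exp(-((i - half) ** 2 + (j - half) ** 2) / (2 * sigma ** 2))
               for j in range(size)] for i in range(size)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def l_shaped_mask(n, arm):
    """n x n grid: 1.0 inside a hypothetical L-shaped active area, 0.0 elsewhere."""
    mask = [[0.0] * n for _ in range(n)]
    lo, hi = n // 4, 3 * n // 4
    for r in range(lo, hi):
        for c in range(lo, hi):
            if c < lo + arm or r >= hi - arm:  # vertical arm or horizontal arm
                mask[r][c] = 1.0
    return mask

def convolve_same(image, kernel):
    """Zero-padded 2-D convolution with output the same size as the input.
    The kernel is symmetric, so flipping it is unnecessary."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i, krow in enumerate(kernel):
                rr = r + i - half
                if 0 <= rr < rows:
                    for j, kv in enumerate(krow):
                        cc = c + j - half
                        if 0 <= cc < cols:
                            acc += kv * image[rr][cc]
            out[r][c] = acc
    return out

if __name__ == "__main__":
    mask = l_shaped_mask(32, 5)
    psf = convolve_same(mask, gaussian_kernel(9, 2.0))
    print("peak response:", round(max(max(row) for row in psf), 3))
    print("response just outside the active area:", round(psf[7][8], 3))
```

The modeled response is nonzero outside the geometrical active area, which is the qualitative effect of diffusion that the unified model is meant to capture. In practice the kernel parameters would come from the fitted diffusion length rather than from the placeholder values used here.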

The dimensions of the kernel matrix are important. Figure 2-11 explains 
the rationale for choosing these dimensions. At the points corresponding to 
the kernel dimension (i.e., points 7 to 9), a minimum is reached for both the 
mean and standard deviation functions obtained from the comparison 
between the modeled PSF and the scanned PSF. 

Figure 2-11. Dependence of the mean and standard deviations on the kernel matrix 
dimensions. 

In both cases, these kernel matrix dimensions equal the physical pixel 
size used in this representation. Thus, we conclude that the diffusion occurs 
primarily within the pixel, i.e., $L_{eff}$ ≈ 24.4 µm. The value for this parameter 
obtained here directly from the model agrees with the one 
previously obtained analytically by fitting the data. 

The experimental results have been compared with predictions for 
several practical designs. A fair correspondence of the simulated PSF with 
the measured PSF was generally obtained. A thorough discussion of the 
results follows in the next section. 

2.5 Results and discussion 

To compare the PSF and the MTF of practical pixels with square, 
rectangular, and L-shaped active areas, the two-dimensional MTFs of these 
cases were calculated, simulated, and compared with the measurements. The 
measurements currently used for the analysis were obtained with relatively 
old CMOS 1.2 µm process chips as described in [10]. However, the analysis 
presented here is general in nature, and so a similar MTF behavioral trend is 
expected for scalable CMOS processes. The design consisted of an APS 
sensor with differently shaped pixel active areas: a square-shaped active area 
with a fill factor of about 8%, a rectangular-shaped active area with a fill 
factor of 31%, and an L-shaped design with a fill factor of around 55%. 

Figure 2-12. The MTF contour plot calculated from the PSF obtained by laser scanning of the 
L-shaped pixel design (after [10]). 

An example of the L-shaped layout is shown in Figure 2-3. Figure 2-8 
shows the corresponding point spread function map obtained by laser 
scanning in subpixel resolution (after [10]). Figure 2-12 shows the 
corresponding MTF, calculated via a 2-D Fourier transform. 
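
The conversion from a sampled PSF to an MTF can be sketched with a direct discrete Fourier transform. This is a minimal illustration on small hypothetical arrays, not the processing chain used for the measured data; for real scans a fast Fourier transform routine would normally be used instead.

```python
import cmath

def dft_1d(values):
    """Direct (slow) 1-D discrete Fourier transform."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n)
                for m, v in enumerate(values))
            for k in range(n)]

def mtf_2d(psf):
    """Magnitude of the 2-D DFT of the PSF, normalized to unity at zero frequency.
    Returns a list indexed as mtf[ky][kx]."""
    rows = [dft_1d(row) for row in psf]           # transform along x
    cols = [dft_1d(col) for col in zip(*rows)]    # transform along y
    spectrum = [list(r) for r in zip(*cols)]      # back to [ky][kx] order
    dc = abs(spectrum[0][0])
    return [[abs(v) / dc for v in row] for row in spectrum]

if __name__ == "__main__":
    # Hypothetical 4 x 4 PSF: a single responsive point (ideal point response).
    point = [[0.0] * 4 for _ in range(4)]
    point[1][2] = 1.0
    print(mtf_2d(point)[0])   # flat MTF: every frequency passes unattenuated

    # Hypothetical uniform response over the whole 4 x 4 window.
    flat = [[1.0] * 4 for _ in range(4)]
    print([round(v, 6) for v in mtf_2d(flat)[0]])  # only zero frequency survives
```

The two extreme cases bracket the behavior of a real pixel: a narrower spatial response gives a broader MTF, and a wider response gives a narrower one.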

Figure 2-13 represents the PSF map obtained from the unified computer 
model and Figure 2-14 shows the corresponding MTF contour plot. 

Figure 2-15 and Figure 2-16 show the comparisons of the model and 
actual PSF plots. Figure 2-15 gives the difference between the measured PSF 
and the pure geometrical PSF (i.e., the PSF resulting when the active-area 
pixel response is represented by unity in the active area and zero otherwise 
[10]) for a specific pixel. The maximum difference observed is about 20% of 
the maximum pixel response. Figure 2-16 compares the measured PSF and 
the PSF obtained from the unified model, which takes into account the 
diffusion effects. The maximum difference observed is about 3% of the 
maximum pixel response. 

Table 2-1 compares the extracted diffusion length values and confirms 
the universality of the method described in this chapter. The calculations are 
based on data extracted from relevant literature sources. The same model can 
be used for any process design. 

Figure 2-15. The difference between the pure geometrical PSF and the PSF obtained by 
scanning the L-shaped pixel design. The lightest areas indicate the strongest response. 
The maximum difference is about 20% of the maximum pixel response. 




Figure 2-16. The difference between the unified model PSF and the PSF obtained by 
scanning the L-shaped pixel design. The lightest areas indicate the strongest response. 
The maximum difference is about 3% of the maximum pixel response. 

Table 2-1. Comparison of extracted diffusion length values. 

Diffusion length                               Value obtained (µm) 
Extracted by function fitting                  ~24 
Extracted by kernel optimization               ~24.4 
Calculated, $L_{eff} = \sqrt{D_n \tau}$        ~27 
  ($D_n$ = 37.5 cm²/sec (300 K) [13]; 
   $\tau$ = 20 µsec [14]) 


It has already been shown [10] that the active-area shape contributes 
significantly to the behavior of the overall MTF. However, there are 
essential differences between the actual and geometrical MTFs. The unified 
model presented here gives better agreement between the modeled and the 
actually measured PSF and MTF. Some difference can still be seen (Figure 
2-16) between the actually scanned and the modeled PSF matrices. 
However, these differences are confined to the background level, and on 
average are calculated to be less than 1%. They occur due to other factors in 
the design and process, such as optical crosstalk [18]. Optical crosstalk 
results from interference in the oxide level, especially between metal lines, 
and has an effect on the overall MTF [1, 4, 6]. This factor would have a 
larger effect as the pixel size scales in multi-level metal processes [19]. 

In addition, the effects of the laser spot profile (used for scanning) on the 
resulting PSF should be considered for smaller pixels. 

2.6 Summary 

Based on the analysis of subpixel scanning sensitivity maps, a unified 
model for estimating the MTF of a CMOS-APS solid-state image sensor was 
developed. This model includes the effect of photocarrier diffusion within 
the substrate in addition to the effects of the pixel sampling aperture 
shape and size. 

Minority-carrier diffusion length, which is characteristic for the process, 
was extracted for various active-area pixels via several different methods. 

The comparison of the simulation results with the MTF calculated from 
direct PSF measurements of actual pixels confirmed that the two 
determining factors that affect the overall MTF behavior are the 
active-area shape and the minority-carrier diffusion effect. 

The results also indicated that a reliable estimate of the degradation of 
image performance is possible for any pixel active-area shape; therefore, the 
tradeoffs between conflicting requirements, such as signal-to-noise ratio and 
MTF, can be compared for each pixel design and better overall sensor 
performance can ultimately be achieved. 

The unified model enables a design that provides optimal pixel operation 
(in the MTF sense) based on readily available process and design data. 
Thus, the model can be used as a predictive tool for design optimization in 
each potential application. 

The proposed model is general in nature. However, evolving 
technologies will cause stronger scaling effects, which will necessitate 
further model enhancements. In addition, the unique submicron scanning 
system [20, 21] will allow exploration of the strong wavelength dependence 
of the diffusion component of the MTF, and it is hoped that aperture and 
lateral diffusion effects can be separated via empirical measurements. 

Abstract: A semi-analytical model has been developed for the estimation of the 
photoresponse of a photodiode-based CMOS active pixel sensor (APS). This 
model, based on a thorough analysis of experimental data, incorporates the 
effects of substrate diffusion as well as geometrical shape and size of the 
photodiode active area. It describes the dependence of pixel response on 
integration photocarriers and on conversion gain. The model also demonstrates 
that the tradeoff between these two conflicting factors gives rise to an 
optimum geometry, enabling the extraction of a maximum photoresponse. The 
dependence of the parameters on the process and design data is discussed, and 
the degree of accuracy for the photoresponse modeling is assessed. 

Key words: CMOS image sensor, active pixel sensor (APS), diffusion process, quantum 
efficiency, parameter estimation, optimization, modeling. 



3.1 Introduction 

This work is a logical continuation of the pixel response analysis 
published by Yadid-Pecht et al. [1]. In this chapter, a pixel photoresponse is 
analyzed and quantified to provide the information necessary for its 
optimization. A novel way for the determination and prediction of the 
imager quantum efficiency (QE) is also presented. In this method, the QE is 
broadly interpreted to be dependent on the process and design data, i.e., on 
the pixel geometrical shape and fill factor. It is worth noting that QE, which 
is one of the main figures of merit for imagers, has been considered a whole 
pixel characteristic without any specific attention to the internal pixel 
geometry. However, it is useful to divide it into the main and diffusion parts. 
Even though the active area has the most effect on the output, the substrate 
parts could account for up to 50% of the total output signal. The derived 
expression exhibits excellent agreement with the actual measurements 
obtained from a 256 × 256 CMOS active pixel sensor (APS) imager. The 
simplicity and the accuracy of the model make it a suitable candidate for 
implementation in photoresponse simulation of CMOS photodiode arrays. 

This chapter presents the proposed photoresponse model and shows the 
correspondence between the theory and the experimental data. The model is 
then used to predict a design that will enable maximum response in different 
scalable CMOS technologies. 

3.1.1 General remarks on photoresponse 

Single-chip electronic cameras can be fabricated using a standard CMOS 
process; these cameras have the advantages of high levels of integration, low 
cost, and low power consumption [2-13]. However, getting an acceptable 
response from the currently available CMOS structures is a major difficulty 
with this technology. 

A particular problem encountered by designers of vision chips in 
standard CMOS technologies is that foundries do not deliver 
characterization data for the available photosensors. Thus, designers are 
forced to use simplistic behavioral models based on idealized descriptions 
of CMOS-compatible photosensors. This may be enough for chips intended to 
process binary images. However, the characteristics of real devices differ 
considerably from their idealized descriptions and depend strongly on the 
fabrication process. Consequently, using these idealized models yields very 
inaccurate results whenever the analog content of the incoming light 
(regarding both intensity and wavelength) is significant for signal 
processing; i.e., chips whose behavior is anticipated on the basis of such 
simplified models will most likely not meet their specifications. In this 
scenario, designers of vision chips are confronted by the necessity to 
fully characterize the CMOS-compatible photosensors by themselves. This is 
not a simple undertaking and requires expertise that is not common among 
chip designers. This chapter is intended to help designers acquire the 
skills to accomplish this task. 

In order to predict the performance of an image sensor, a detailed 
understanding is required of the photocurrent collection mechanism in the 
photodiodes that comprise the array. 

The response of the detectors can be affected by different parameters that 
are dependent on the process or on the layout. Process-dependent parameters 
(such as junction depths and doping concentration) cannot be modified by 
the designer, whereas layout-dependent parameters (detector size and shape) 
are under the designer’s control. Both types of parameters have to be 
considered when designing a test chip for characterizing visible-light 
detectors. All possible devices or structures available in the technology to 
detect visible light have to be included in the chip so that their actual 
behavior can be measured. In addition, these detectors should each be tested 
in different sizes and shapes to account for the layout-dependent parameters. 
Each detector should ideally be tested in as many sizes as possible, with a 
range of different perimeters (shapes) for each size. Furthermore, the chip 
should include as many copies of each detector configuration as possible to 
obtain statistical information on their response. Since detectors must be 
connected to bonding pads to have direct access to the detected signal, the 
cost of fabrication of such a chip would be very high. 

A more realistic approach would use a limited number of discrete 
detectors with different sizes and shapes, but still covering all the possible 
device structures; several arrays of these detectors would be tested to 
provide the statistical information. Accordingly, the experimental data in this 
chapter were acquired from several APS chips fabricated in standard, 
scalable CMOS technology processes (e.g., standard 0.5 µm and 0.35 µm 
CMOS processes). Various topologies of the photosensitive area were 
implemented. All the pixels had a common traditional three-transistor type 
of readout circuitry (see Figure 3-1), enabling behavior identification for the 
different pixel types. Deviations in the device geometry were demonstrated 
to affect overall performance and thus these dependencies can be used as a 
predictive tool for design optimization. 



Figure 3-1. Traditional pixel readout circuitry. Transistor M2 is used as a source follower and 
the M2 gate is used to read out the photodiode voltage. 

When light is incident on a diode (APS systems usually consist of 
photodiode or photogate arrays on a silicon substrate), photons absorbed in 
the depletion region of the p-n junction create electron-hole pairs that are 
separated by the high internal electric field. These new charges can be 
detected as an increase in the reverse current of the device or as a change in 
the voltage across it. Carriers generated in the bulk of the silicon, if they are 
less than a minority-carrier diffusion length away from the depletion region, 
can also contribute to the detected signal and thus increase the sensitive 
volume of the detector. The collection of photocarriers along the lateral edge 
of the photodiode is known as the peripheral photoresponse or the lateral 
photocurrent [14-17]. The number of carriers collected by a pixel therefore 
depends on its geometry; i.e., the collected signal grows as the photodiode 
dimensions increase. The total signal can be represented as the sum of the 
main (photodiode) response and the periphery response (due to the 
successfully collected diffusion carriers). 

Figure 3-2 schematically shows cross-sections of imager sites and 
indicates their depletion-region boundaries. 

The response of these detectors is also affected by other parameters, such 
as technology scaling and detector size and shape. It is important to 
characterize all possible detectors for a range of sizes and shapes 
(independent of the circuitry) to obtain the photogenerated current levels for 
each case. All possible detectors also have to be characterized for each 
technology in order to find the optimum device and match the requirements 
of the signal processing circuitry. 

In the case of CMOS APS, the charge-to-voltage conversion gain is 
typically dominated by the photodiode junction capacitance, which is 
composed of the bottom and sidewall capacitances. This capacitance can be 
expressed as 

$$C_{jdep} = \frac{C_{J0B}\,A_D}{\left[1 - (V_d/\Phi_B)\right]^{m_j}} + \frac{C_{J0sw}\,P_D}{\left[1 - (V_d/\Phi_{Bsw})\right]^{m_{jsw}}} \tag{3.1}$$

where 

- $C_{jdep}$ represents the depletion capacitance of the p-n junction; 

- $C_{J0B}$ and $C_{J0sw}$ represent the zero-bias capacitances of the bottom and the 
sidewall components, respectively; 

- $V_d$ is the voltage applied to the photodiode; 

- $\Phi_B$ and $\Phi_{Bsw}$ stand for the built-in potentials of the bottom and the 
sidewalls, respectively; 

- $m_j$ and $m_{jsw}$ stand for the grading coefficients of the bottom and the 
sidewalls, respectively; 

- $A_D$ represents the photodiode area (bottom component); and 

- $P_D$ represents the photodiode perimeter. 

Figure 3-2. Cross-sections of imager sites and their depletion-region boundaries. 

The conversion factor, $C_{gain} = q_{el}/C_{jdep}$ (in µV/e⁻), is inversely 
proportional to the pixel geometry. The factor $q_{el}$ is the electron charge. 
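To make the geometry dependence of the conversion gain concrete, the sketch below evaluates a junction capacitance of the usual two-component form (bottom term scaled by area, sidewall term scaled by perimeter) and the resulting gain. It is a hedged illustration: the function names, the sign convention (negative `v_d` for reverse bias), and every numerical process value are assumptions chosen for the example, not data from the chips described in this chapter.

```python
Q_EL = 1.602176634e-19  # electron charge in coulombs

def junction_capacitance_fF(area_um2, perimeter_um, cj0b, cj0sw,
                            v_d=0.0, phi_b=0.8, phi_bsw=0.8,
                            m_j=0.5, m_jsw=0.33):
    """Depletion capacitance in fF.

    cj0b is in fF/um^2 and cj0sw in fF/um (hypothetical values below).
    v_d is negative for reverse bias (assumed sign convention).
    """
    bottom = cj0b * area_um2 / (1.0 - v_d / phi_b) ** m_j
    sidewall = cj0sw * perimeter_um / (1.0 - v_d / phi_bsw) ** m_jsw
    return bottom + sidewall

def conversion_gain_uV_per_e(c_fF):
    """Conversion gain q/C in microvolts per electron."""
    return Q_EL / (c_fF * 1e-15) * 1e6

# Largest photodiode of the pixel set in Figure 3-3: 40 um^2 area, 23 um perimeter.
c_zero_bias = junction_capacitance_fF(40.0, 23.0, cj0b=0.5, cj0sw=0.3)
c_reverse = junction_capacitance_fF(40.0, 23.0, cj0b=0.5, cj0sw=0.3, v_d=-2.0)

print(c_zero_bias, conversion_gain_uV_per_e(c_zero_bias))
print(c_reverse, conversion_gain_uV_per_e(c_reverse))
```

Shrinking the area and perimeter lowers the capacitance and raises the gain, which is the inverse dependence on geometry noted above.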

The basic transfer characteristics of a photodetector are usually described 
by its quantum efficiency, which depends on the wavelength and describes 
the number of electrons generated and collected per incident photon. The 
quantum efficiency is related to the spectral response (in A/W) according to 
the equation: 

$$SR(\lambda) = \frac{\lambda\,q_{el}}{h\,c}\,QE(\lambda) \tag{3.2}$$

If a photodiode is exposed to a spectral power density $\Phi(\lambda)$, the collected 
photocharge can be expressed as 

$$Q = A_{eff}\,t_{int} \int SR(\lambda)\,\Phi(\lambda)\,d\lambda \tag{3.3}$$

with $A_{eff}$ denoting the effective photoactive area of a pixel and $t_{int}$ the 
integration time. Illumination is assumed to be constant during the exposure 
time. 
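The relation between quantum efficiency, spectral response, and collected photocharge can be sketched numerically as follows. The flat quantum efficiency of 0.4, the flat spectral power density, the 40 µm² effective area, and the 30 ms integration time are illustrative assumptions only, and the helper names are hypothetical.

```python
import numpy as np

H = 6.62607015e-34      # Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
Q_EL = 1.602176634e-19  # electron charge, C

def spectral_response(wavelength_nm, qe):
    """Spectral response in A/W for a given quantum efficiency (electrons/photon)."""
    lam_m = np.asarray(wavelength_nm, dtype=float) * 1e-9
    return qe * Q_EL * lam_m / (H * C)

def collected_charge(wavelength_nm, sr, phi, a_eff_m2, t_int_s):
    """Q = A_eff * t_int * integral of SR * Phi over wavelength (trapezoidal rule).

    phi is the spectral power density in W / (m^2 nm); the result is in coulombs.
    """
    lam = np.asarray(wavelength_nm, dtype=float)
    y = np.asarray(sr, dtype=float) * np.asarray(phi, dtype=float)
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(lam))
    return a_eff_m2 * t_int_s * integral

lam = np.linspace(450.0, 650.0, 201)      # nm
sr = spectral_response(lam, qe=0.4)       # assumed flat QE of 0.4
phi = np.full_like(lam, 1e-3)             # assumed flat 1 mW / (m^2 nm)
q = collected_charge(lam, sr, phi, a_eff_m2=40e-12, t_int_s=0.03)

print(spectral_response(550.0, 1.0))      # A/W at 550 nm for QE = 1
print(q / Q_EL)                           # collected electrons
```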

The voltage swing that is obtained from the collected photocharge is 
inversely proportional to the integration capacitance, $C_{int}$ (which equals the 
depletion capacitance $C_{jdep}$ in CMOS APS), as follows: 

$$V = \frac{Q}{C_{int}} = \frac{A_{eff}\,t_{int} \int SR(\lambda)\,\Phi(\lambda)\,d\lambda}{C_{int}} \tag{3.4}$$

Figure 3-3. A sample from an otherwise identical square-shaped pixel set with decreasing 
active area (photodiode) dimensions. The photodiode areas in the full pixel set vary from 
40 µm² down to 5.5 µm² and their perimeters from 23 µm down to 9.3 µm. 

Consequently, pixel signal output is proportional to the product of the 
integrated photocarriers and the conversion gain. The tradeoff between these 
two conflicting parameters is most important. Indeed, it has already been 
shown that a smaller active area contributes to a higher output signal. This 
is mainly due to an increase in conversion gain, but the active-area 
surroundings probably have an effect as well [1, 5, 10, 18]. However, a 
smaller active area reduces the fill factor, which directly influences the 
signal and the signal-to-noise ratio. Both the active area and the fill 
factor should therefore be as large as possible. 

3.2 Photoresponse model 

Figure 3-3 shows a sample from a set of pixels of identical square shape 
but decreasing active area (photodiode) dimensions. The photodiode areas 
vary from 40 µm² down to 5.5 µm², and their perimeters vary from 23 µm 
down to 9.3 µm. 

Figure 3-4 displays the signal output obtained for illumination at three 
different wavelengths with the various photodiode geometries of the pixel 
set represented in Figure 3-3. 

The curves share the same behavior, each displaying a pronounced 
maximum in the response. Pixel sets of different photodiode active-area 
geometrical shapes were tested, including circular, rectangular, and 
L-shaped. For each shape, a similar phenomenon was observed: each type of 
pixel gives a maximum response for particular illumination conditions. Note 
that for each wavelength ($\lambda$ was changed from 450 nm to 650 nm) the 
measurements were performed under uniform light (an integrating sphere 
was used). The tradeoff between the two conflicting factors, integrated 
photocarriers and conversion gain, gives rise to an optimum geometry and 
thus the maximum photoresponse. 

Figure 3-4. The signal output of the pixel set represented in Figure 3-3 with illumination at 
three different wavelengths and with changes in the linear dimensions of the photodiodes. 

A semi-analytical expression has been derived for a diffusion-limited 
pixel photosignal in a closely spaced photodiode array. In this expression 
each one of the parameters determining the signal output depends on the 
photodiode area and perimeter: 

$$\frac{V_{out}(\lambda)}{N_{p\lambda}} = (\text{integration photocarriers} \times \text{conversion gain}) = \frac{k_1 A + k_2 P d \left(\dfrac{S - A}{S}\right)\left(1 - \dfrac{4P_i - P}{8L_{diff}}\right)}{k_3 A + k_4 P} \tag{3.5}$$




The left part of this equation, $V_{out}(\lambda)/N_{p\lambda}$, corresponds to the pixel output 
voltage signal related to the number of incoming photons (in a time unit). 

On the right side of equation (3.5), the denominator represents the 
conversion factor and the numerator represents the integrated photocarriers. 
The term $1/(k_3 A + k_4 P)$ represents the conversion gain, which depends on 
both the photodiode area and perimeter. The numerator consists of two terms 
that contribute to the total number of carriers collected by the imager. The 
first term, $k_1 A$, represents the contribution of the photocarriers created 
within the photodiode itself, i.e., the contribution of the active area. The 
second term, 

$$k_2 P d \left(\frac{S - A}{S}\right)\left(1 - \frac{4P_i - P}{8L_{diff}}\right),$$

represents the contribution of the periphery or the “lateral diffusion current”, 
i.e., the carriers that had been created in the area surrounding the photodiode, 
had successfully diffused towards the photodiode, and had been collected. 

Figure 3-5. The sidewall or lateral collecting surface, built up from the junction bottom and 
sidewall depletion capacitances. P (in µm) is the photodiode perimeter; $P_x$ and $P_y$ are its 
dimensions in the x and y directions; d (in µm) is the junction depth; and $\delta P_x$ and $\delta P_y$ are the 
lateral depletion stretches. 

Pd (in µm²) represents the lateral collecting surface or the interaction 
cross-section for lateral diffusion, where P (in µm) is the photodiode 
perimeter, and d (in µm) is the junction depth. Since $\delta P_x \ll P_x$ and 
$\delta P_y \ll P_y$ (where $\delta P_x$ and $\delta P_y$ are the lateral depletion stretches of the 
photodiode), $\delta P_x$ and $\delta P_y$ can be neglected. Therefore, it can be assumed that 
the perimeter P (in µm) itself defines the boundary of the lateral interaction 
area (see Figure 3-5). 

The term ((S - A)/S) is dimensionless. Since the optical generation rate is 
relatively uniform throughout the substrate, this multiplier is proportional to 
the relative number of carriers created within the pixel substrate around the 
photodiode. Here, (S - A) (in µm²) represents the substrate area, i.e., the 
unoccupied area surrounding the photodiode within the pixel, whereas A 
(in µm²) is the photodiode active area. The lower the fill factor, the higher 
the number of the carriers created within the photodiode surroundings that 
can diffuse and contribute to the total signal. It is clear that this multiplier 
represents the array influence on the pixel photodiode, and the boundary 
condition, i.e., as the active area increases to the pixel boundary and A = S is 
approached for all the pixels in the array, the diffusion contribution to the 
signal approaches zero. Figure 3-6 illustrates the situation where the 
diffusion equals zero as a result of maximum expansion of the photodiode 
area for all pixels in the array. Therefore, the multiplier ((S - A)/S) 
represents the relative number of the potential diffusion contributors. 

Figure 3-6. A schematic illustration of the boundary condition where the active area is 
increased to the maximum possible and reaches the pixel boundary (i.e., A = S for all the 
pixels in the array). When this occurs, the diffusion contribution will equal zero. 

The term $(1 - (4P_i - P)/(8L_{diff}))$ is dimensionless and indicates the 
approximate relative distance that the prospective contributor has to pass 
before it is trapped in the depletion region. In this term, $L_{diff}$ (in µm) is the 
characteristic diffusion length and $P_i$ (in µm) is the pixel pitch. As the 
photodiode dimensions and the perimeter P (in µm) increase, the maximum 
distance that a carrier created within the substrate has to diffuse before it is 
collected by the peripheral sidewall collecting surface Pd decreases. Thus, 
the diffusion contribution increases. This multiplier is obtained from a series 
expansion of the expression $\exp[-(4P_i - P)/(8L_{diff})]$, which represents the 
solution to the one-dimensional diffusion equation [19-21]. Since the 
distances between the photodiodes in the array (and therefore the maximum 
carrier path) are small compared to the minority-carrier diffusion length, it 
is sufficient to consider only the first two terms of the expansion. Figure 3-7 
illustrates the dependence of diffusion distance on the photodiode geometry. 

Figure 3-7. A schematic cross-section of two imager sites, illustrating the dependence of 
diffusion distance on the photodiode geometry. $P_i$ is the pixel pitch. 
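As a rough check of the two-term truncation, the worked numbers below take the 14 µm pitch of the measured array and the smallest perimeter of the pixel set in Figure 3-3 (9.3 µm). The diffusion length of 68 µm is an assumption borrowed from the 0.5 µm column of Table 3-1, and the shorthand x is introduced here only for illustration.

```latex
x = \frac{4P_i - P}{8L_{diff}} = \frac{4(14) - 9.3}{8(68)} \approx 0.086,
\qquad
e^{-x} = 1 - x + \frac{x^2}{2} - \dots,
\qquad
0 \le e^{-x} - (1 - x) \le \frac{x^2}{2} \approx 0.0037 .
```

Under these assumptions the neglected terms change the multiplier by less than about 0.4%, so keeping only the first two terms appears reasonable for this geometry.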

Also in equation (3.5), $V_{out}(\lambda)$ (in volts) is the pixel signal output for a 
particular wavelength. 

$N_{p\lambda}$ (in photons/sec) is the photon irradiance. Since watts are joules per 
second, one watt of monochromatic radiation at wavelength $\lambda$ corresponds to 
$N_{p\lambda}$ photons per second. The general expression is 

$$\frac{dN_{p\lambda}}{dt} = 5.03 \times 10^{15}\,P_\lambda\,\lambda$$

where $P_\lambda$ is in watts and $\lambda$ is in nm. 
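The numerical constant in this expression follows from the photon energy hc/λ. The short sketch below compares the rounded constant with the value computed from the physical constants; the function names are hypothetical and the 1 W, 550 nm operating point is chosen only as an example.

```python
H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s

def photon_rate_exact(power_w, wavelength_nm):
    """Photons per second: optical power divided by the photon energy h*c/lambda."""
    return power_w * wavelength_nm * 1e-9 / (H * C)

def photon_rate_approx(power_w, wavelength_nm):
    """Same quantity using the rounded constant 5.03e15 (watts and nanometers)."""
    return 5.03e15 * power_w * wavelength_nm

exact = photon_rate_exact(1.0, 550.0)
approx = photon_rate_approx(1.0, 550.0)
print(exact, approx, abs(exact - approx) / exact)
```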

The coefficient $k_1$ (in µm⁻²) describes the unit active-area contribution to 
the total number of electrons collected by the imager, i.e., the number of 
electrons collected by the unit photodiode area in a time unit. 

The coefficient $k_2$ (in µm⁻²) describes the photodiode unit peripheral-area 
contribution to the total number of electrons collected by the imager, i.e., the 
number of electrons collected by the sidewall collecting surface within the 
substrate. Figure 3-5 illustrates the sidewall or lateral collecting surface, 
which is built up from junction bottom and sidewall depletion capacitances. 

The coefficients $k_3$ and $k_4$ (in aF·µm⁻² and aF·µm⁻¹, respectively) describe 
the bottom and sidewall capacitances in the regular junction capacitance 
sense (see equation (3.1)) and are defined by the particular process data. 

In summary, the diffusion contribution to the overall signal is 
proportional to the lateral collecting area, the number of possible 
contributors, and the distance that the carrier has to pass before its collection 
by the photodiode. All parameters, with the exception of the coefficients $k_1$ 
and $k_2$, are defined by the process and design data. The expression agrees 
well with experimental results, as shown in the following section. 
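The structure of the model, a main term plus a diffusion term divided by an area- and perimeter-dependent capacitance, can be written as a small function. This is a sketch under stated assumptions: the function name `pixel_signal`, the keyword names, and all coefficient values are hypothetical and in arbitrary units; only the 14 µm pitch (S = 196 µm²) is taken from the measured array.

```python
def pixel_signal(area, perimeter, *, k1, k2, k3, k4,
                 junction_depth, pixel_area, pitch, l_diff):
    """Normalized pixel output: (collected carriers) x (conversion gain).

    area, pixel_area in um^2; perimeter, junction_depth, pitch, l_diff in um.
    """
    main = k1 * area                                   # active-area contribution
    fill_term = (pixel_area - area) / pixel_area       # relative substrate area
    distance_term = 1.0 - (4.0 * pitch - perimeter) / (8.0 * l_diff)
    diffusion = k2 * perimeter * junction_depth * fill_term * distance_term
    return (main + diffusion) / (k3 * area + k4 * perimeter)

# Hypothetical coefficients (arbitrary units), chosen only for illustration.
params = dict(k1=1.0, k2=10.0, k3=0.1, k4=1.0,
              junction_depth=0.5, pixel_area=196.0, pitch=14.0, l_diff=40.0)

# Boundary condition A = S: the diffusion term vanishes.
full = pixel_signal(196.0, 56.0, **params)
# A smaller square photodiode (side 4 um) keeps a diffusion contribution.
small = pixel_signal(16.0, 16.0, **params)
print(full, small)
```

With the photodiode filling the whole pixel, the output reduces to the main term divided by the capacitance term, which matches the boundary condition illustrated in Figure 3-6.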

3.3 Comparison with experimental results 

We have performed a functional analysis of experimental data obtained 
from a CMOS APS array with a 14 × 14 µm pixel pitch by means of scientific 
analysis software. Pixel sets with square, rectangular, circular, and 
L-shaped active areas of different sizes have been tested. The response was 
analyzed for different wavelengths in the visible spectrum. Example results 
for a photodiode pixel with a square-shaped active area are presented here. 

The solution (in the minimum variance sense) of equation (3.5) for 
different pixel sets and different wavelengths enabled the extraction of the 
missing coefficients $k_1$ and $k_2$. The combination of these coefficients 
determined the contributions to the total pixel output signal and remained 
constant for all pixels at certain wavelength exposures. 

It is evident from Figure 3-4 that a longer wavelength enabled better 
response; for example, the signal obtained at 570 nm was almost two times 
the one obtained at 490 nm. This can be related to better absorption of red 
visible radiation within the semiconductor depth and thus better quantum 
efficiency. Moreover, the curves in Figure 3-4 are shifted approximately in 
parallel. Since this output change was defined only by $k_1$ and $k_2$ (all other 
terms are wavelength independent), it was to be expected that the 
coefficients are wavelength dependent and that their values increase for 
longer wavelengths. As mentioned earlier, these coefficients represent the 
number of electrons collected by the photodiode via its upper and lateral 
faces, respectively. Since the output signal was normalized to the number of 
photons impinging on the pixel for uniform incident illumination, $k_1$ and 
$k_2$ were interpreted as the quantum efficiency per unit area for the upper 
and lateral faces, respectively. 

As the output signal for 570 nm was approximately double that for 
490 nm, it was predicted that the QE for 570 nm would be double the QE 
for 490 nm. The values obtained for the coefficients $k_1$ and $k_2$ confirmed 
this result: $k_1/k_2$ (570 nm) ≈ 0.468/0.229, and $k_1/k_2$ (490 nm) ≈ 0.215/0.107. 
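Reading each quoted pair as the values of the first and second coefficient at that wavelength (an interpretation of the notation above, not an additional measurement), the wavelength ratios can be checked directly:

```latex
\frac{k_1(570\,\mathrm{nm})}{k_1(490\,\mathrm{nm})} \approx \frac{0.468}{0.215} \approx 2.18,
\qquad
\frac{k_2(570\,\mathrm{nm})}{k_2(490\,\mathrm{nm})} \approx \frac{0.229}{0.107} \approx 2.14 .
```

Both ratios are close to two, in line with the approximately doubled output signal.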

It is worth noting that QE, which is one of the main figures of merit for 
imagers, has so far been considered only as a whole-pixel characteristic; 
i.e., no specific attention was paid to the photodiode shape and fill factor 
[1, 13, 22-23]. The work in this chapter demonstrates the merit of dividing 
QE into its main and diffusion portions according to the pixel geometrical 
shape and fill factor. This chapter also introduces a method (solving 
equation (3.5)) for the determination and prediction of QE based on the 
process and design data. The result obtained for the presented pixel sets is 
$k_1/k_2 \approx 2.7$. This demonstrates that even though the active area is the 
primary source of the output, the substrate could account for up to 50% of 
the total output signal. 

Figure 3-8. Comparisons at two different wavelengths of the derived expression, equation 
(3.5), with the actual pixel measurements presented in Figure 3-3 and Figure 3-4. 

A 3-D graph presented in Figure 3-8 shows example comparisons at two 
different wavelengths of the derived expression, equation (3.5), and the 
actual pixel measurements presented in Figure 3-3 and in Figure 3-4. It 
illustrates the correspondence between the measured and the modeled output 
signal as the pixel area and perimeter are varied. Furthermore, the modeled 
function reaches its maximum exactly at the point marked by the 
measurements, confirming the proposed assumption that a compromise 
between integration photocarriers and conversion gain results in an optimum 
photodiode geometry. Thus, the model enables the successful prediction of a 
geometry that provides maximum output signal. Based on equation (3.5) and 
the specific process data, it is possible to select the photodiode shape and 
size that will provide the highest outcome. 

The functional analysis of equation (3.5) enables the determination of the 
variables (area A and perimeter P) corresponding to the maximum value of 
the function. It should be noted that area and perimeter are not independent; 
the function describing their dependence is implicit and must be taken into 
consideration. In the case of a symmetrical photodiode shape, the function 
connecting area and perimeter can be easily obtained, and equation (3.5) 
thus reduces to the case of only one independent variable; for example, the 
expression $P = 4A^{1/2}$ relates area and perimeter for the square-shaped 
photodiode. This result is confirmed in Figure 3-9, which presents a 2-D 
comparison of the dependence of the modeled and the experimental output 
signal on the photodiode area for the square-shaped pixel data set of Figure 
3-8. A more complicated photodiode shape could always be represented as 
an aggregate of the elementary symmetrical parts [17, 24] and investigated 
in order to obtain a relation with only one independent variable as above. 

Figure 3-9. A comparison of the dependence of the modeled and experimental pixel output 
signal on the photodiode area for the square-shaped pixel data set of Figure 3-8. 
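For a square photodiode the single-variable reduction can be explored with a simple sweep over the side length. The sketch below is illustrative only: the coefficient values are hypothetical and in arbitrary units (they were picked so that the tradeoff is visible), so the location of the maximum says nothing about the measured pixels.

```python
import numpy as np

# Hypothetical model coefficients (arbitrary units) and geometry in um.
K1, K2, K3, K4 = 1.0, 10.0, 0.1, 1.0
JUNCTION_DEPTH = 0.5    # um, assumed
PITCH = 14.0            # um, pitch of the measured array
PIXEL_AREA = PITCH ** 2
L_DIFF = 40.0           # um, assumed

def square_pixel_signal(side):
    """Model output for a square photodiode, using A = side^2 and P = 4*side."""
    side = np.asarray(side, dtype=float)
    area = side ** 2
    perimeter = 4.0 * side
    diffusion = (K2 * perimeter * JUNCTION_DEPTH
                 * (PIXEL_AREA - area) / PIXEL_AREA
                 * (1.0 - (4.0 * PITCH - perimeter) / (8.0 * L_DIFF)))
    return (K1 * area + diffusion) / (K3 * area + K4 * perimeter)

sides = np.linspace(1.0, PITCH, 1301)      # sweep from 1 um up to the full pitch
signal = square_pixel_signal(sides)
i_best = int(np.argmax(signal))
best_side = sides[i_best]
print(best_side, best_side ** 2, signal[i_best])
```

With these assumed values the maximum falls strictly inside the swept range, which reproduces qualitatively the compromise between collected photocarriers and conversion gain.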

The model corresponds well with the data, with the maximum divergence of 
the model result from the measurements limited to 5%. It should be noted 
that surface leakage and the non-ideal transmission rate of the overlayers 
are not included in the present analysis and are considered to be 
second-order effects [25]. 

3.4 CMOS APS pixel photoresponse prediction for scalable CMOS technologies 

This section describes the use of a semi-analytical diffusion-limited CMOS 
APS pixel photoresponse model for maximum pixel photosignal prediction in 
scalable CMOS technologies. 

3.4.1 Impact of technology scaling on sensitivity 

Over the last twenty years, the evolution of CMOS technology has 
followed Moore’s Law: a new generation of technology has been developed 
every three years, and between generations, memory capacity increased by a 
factor of four and logic circuit density increased by a factor of between two 
and three. Furthermore, every six years (two generations), the feature size 
decreased by a factor of two and transistor density, clock rate, chip area, chip 
power dissipation, and the maximum number of pins have doubled. This 
continuous development has led to a reduction in the state-of-the-art 
commercially available minimum lithographic feature size from 3 µm in 
1977 to 0.25 µm in 1998 and 0.1 µm in 2003. It is anticipated that it is 
technically possible to maintain the pace shown in Table 3-1 [26]. 
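The quoted trend (feature size halving every six years) can be turned into a one-line projection and compared with the dates given above. The function name and default arguments are hypothetical; the reference point of 3 µm in 1977 is taken from the text.

```python
def projected_feature_size_um(year, ref_year=1977, ref_size_um=3.0,
                              halving_years=6.0):
    """Feature size assuming a factor-of-two reduction every `halving_years`."""
    return ref_size_um * 0.5 ** ((year - ref_year) / halving_years)

for year in (1977, 1998, 2003):
    print(year, round(projected_feature_size_um(year), 3))
```

The projection gives roughly 0.27 µm for 1998, close to the quoted 0.25 µm, and roughly 0.15 µm for 2003, somewhat above the quoted 0.1 µm, so the simple halving rule should be read as an approximate trend only.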

CMOS process development is funded by high-volume sales of standard 
CMOS logic and memory chips. Hence, CMOS imaging technology does 
not have to bear the process development costs and consequently has 
cheaper process costs than CCD imagers. 

Decreases in pixel size much beyond 5 × 5 µm² have not been considered 
to be of much interest due to camera lens diffraction issues. However, in a 
common Bayer patterned RGB color sensor, 2 × 2 pixels define an effective 
color pixel; further downscaling of single pixels may prove useful for fitting 
an entire effective color pixel within the optical lens resolution limit. 
Therefore, pixel sizes below 5 × 5 µm² can be expected for all imagers, 
contrary to an assumption made by Wong [27]. 

The predicted downscaling of the supply voltage (see Table 3-1) will 
lead to a reduction of usable signal range and, together with increased noise 
components, will limit the overall dynamic range. Sources of detrimentally 
high leakage currents that generate shot noise and flood the pixel are 
discussed by Wong [27]. Examples include dark current, p-n junction 
tunneling current, transistor off current, gate oxide leakage, and hot carrier 
effects. 

Table 3-1. Technology parameters for different process generations (after [26]). 

Process                   0.8 µm    0.5 µm    0.35 µm   0.25 µm   0.18 µm   0.1 µm 
Supply voltage (V)        5-3.3     5-3.3     5-3.3     2.5-1.8   1.8-1.5   1.2-0.9 
Interconnect levels       2         2-3       4-5       5-6       6-7       8-9 
Substrate doping (cm⁻³)   8×10¹⁶    1.2×10¹⁷  2.5×10¹⁷  3.4×10¹⁷  5×10¹⁷    1×10¹⁸ 
Junction depth D/S (nm)   350-450   300-400   200-300   50-100    36-72     20-40 
Depletion region (µm)     0.71      0.57      0.39      0.24      0.19      0.1 
Mobility (cm²·V⁻¹·s⁻¹)    825       715       550       485       425       345 
Lifetime (µs)             3.6       2.3       1.1       0.8       0.6       0.3 
Diffusion length (µm)     88        68        41        33        25        15 

As Wong [27] noted, the diffusion collection behavior is modified by 
increasing doping levels through the reduction of mobility and lifetime. 
Under strong absorption conditions, the absorption coefficient $\alpha(\lambda)$ 
and the diffusion length $L_n$ are both much smaller than the width of the 
substrate. Further effects on spectral sensitivity result from the shallower 
drain/source diffusion and the shrinkage of the depletion widths. Figure 3-10 
demonstrates that the scaling effect on diffusion collection efficiency works 
in favor of CMOS imagers. The influence on mobility and lifetime is not 
very strong, so the loss of sensitivity in the infrared range is not very 
pronounced. On the other hand, the shallower diffusions promote the 
collection of blue light in the silicon material. This improves performance 
for visible light applications, especially for color sensors, as was shown 
above. These effects of technology scaling apply directly to the photodiode. 
However, the losses caused by the overlying layers must also be included, 
and they reduce the total QE considerably. 
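
As a rough consistency check on Table 3-1, the diffusion length can be 
estimated from the tabulated mobility and lifetime through the Einstein 
relation, $D_n = \mu_n kT/q$ and $L_n = \sqrt{D_n \tau_n}$. The sketch below 
assumes T = 300 K and reads the lifetime row in microseconds; both are 
assumptions, not statements from the text, and the function and variable 
names are illustrative only. 

```python
import math

K_B_OVER_Q = 8.617333e-5  # Boltzmann constant / elementary charge, in V/K
TEMPERATURE_K = 300.0     # assumed operating temperature

# (process, mobility in cm^2/(V*s), lifetime in us, tabulated L_n in um)
TABLE_3_1 = [
    ("0.8 um", 825, 3.6, 88),
    ("0.5 um", 715, 2.3, 68),
    ("0.35 um", 550, 1.1, 41),
    ("0.25 um", 485, 0.8, 33),
    ("0.18 um", 425, 0.6, 25),
    ("0.1 um", 345, 0.3, 15),
]


def diffusion_length_um(mobility_cm2_per_vs, lifetime_us,
                        temperature_k=TEMPERATURE_K):
    """Estimate L_n = sqrt(D_n * tau_n) with D_n from the Einstein relation."""
    thermal_voltage_v = K_B_OVER_Q * temperature_k
    diffusivity_cm2_per_s = mobility_cm2_per_vs * thermal_voltage_v
    length_cm = math.sqrt(diffusivity_cm2_per_s * lifetime_us * 1e-6)
    return length_cm * 1e4  # cm -> um


estimates = {}
for process, mobility, lifetime, tabulated in TABLE_3_1:
    estimates[process] = diffusion_length_um(mobility, lifetime)
    print(f"{process:>8}: estimated {estimates[process]:5.1f} um, "
          f"tabulated {tabulated} um")
```

Under these assumptions the estimates land within roughly 10% of the 
tabulated diffusion lengths, which is about what the rounding of the 
mobility and lifetime entries allows. 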

Figure 3-10. Internal carrier collection efficiency for different process 
generations (after [26]). 

The lens diffraction limit of 5 µm has been reached with the 0.35 µm 
process. The increased number of routing levels leads to an increasing 
thickness of the layer system on top of the diode, which aggravates the 
problems of absorption, reflection, diffraction, and even light guidance into 
distant pixels. In addition, the accuracy requirements for microlenses 
become very strict as the sensitive spot in the pixels becomes smaller and 
moves further away from the lens. Below 0.35 µm, only those processes that 
allow blocking of silicide layers on diffusions can be employed. The 
shrinkage of capacitances does not generally compensate sufficiently for the 
reduction of sensitive area. On the contrary, the area-specific capacitance 
increases due to increased doping concentrations, which automatically 
reduces the sensitivity. For a given fill factor, the overall sensitivity remains 
low, as shown by the results presented in Table 3-2 [26]. 

Table 3-2. Scaling trends of CMOS sensor parameters for a standard photodiode 
(after [26]). 

Technology          0.8 µm     0.35 µm    0.25 µm    0.18 µm 
Sensor type         standard   standard   standard   standard 
d_pix (µm²)         14×14      7×7        5×5        5×5 
Fill factor (%)     60         60         60         80 
A_diode (µm²)       117.6      29.4       15         20 
C_diff (fF)         5.75       2.7        1.87       0.65 
C_gate (fF)         2.06       1.1        0.67       0.48 
C_diode (fF)        94.4       37.8       38.3       24.2 
C_tot (fF)          102.2      41.6       40.8       25.3 
q/C (µV/e⁻)         1.5        3.8        3.9        6.4 
S (V·µJ⁻¹·cm²)      1.8        1.1        0.57       1.2 
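
The capacitance rows of Table 3-2 can be cross-checked with a few lines of 
arithmetic. The sketch below assumes that the total capacitance is the sum of 
the diffusion, gate and diode capacitances and that the conversion gain is 
q/C_tot; this reading of the row labels is an interpretation of the table, 
and all names in the code are illustrative. 

```python
ELECTRON_CHARGE_C = 1.602176634e-19

# technology: (C_diff, C_gate, C_diode, tabulated C_tot) in fF,
#             followed by the tabulated q/C in uV per electron
TABLE_3_2 = {
    "0.8 um": (5.75, 2.06, 94.4, 102.2, 1.5),
    "0.35 um": (2.7, 1.1, 37.8, 41.6, 3.8),
    "0.25 um": (1.87, 0.67, 38.3, 40.8, 3.9),
    "0.18 um": (0.65, 0.48, 24.2, 25.3, 6.4),
}


def total_capacitance_ff(c_diff_ff, c_gate_ff, c_diode_ff):
    """Sum of the capacitances attached to the sense node, in fF."""
    return c_diff_ff + c_gate_ff + c_diode_ff


def conversion_gain_uv_per_e(c_total_ff):
    """Conversion gain q/C in microvolts per electron."""
    return ELECTRON_CHARGE_C / (c_total_ff * 1e-15) * 1e6


results = {}
for tech, (c_diff, c_gate, c_diode, c_tot_tab, gain_tab) in TABLE_3_2.items():
    c_tot = total_capacitance_ff(c_diff, c_gate, c_diode)
    gain = conversion_gain_uv_per_e(c_tot)
    results[tech] = (c_tot, gain)
    print(f"{tech:>8}: C_tot = {c_tot:6.2f} fF (table {c_tot_tab}), "
          f"q/C = {gain:4.2f} uV/e- (table {gain_tab})")
```

The sums reproduce the tabulated totals, and the conversion gain rises as the 
capacitance shrinks, although not fast enough to offset the loss of 
sensitive area noted above. 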

3.4.2 CMOS APS pixel photoresponse predictions for scalable CMOS technologies 

It has recently been shown ([28] and section 3.2) that for any pixel 
active-area shape, a reliable estimate of the degradation of image 
performance is possible. The tradeoff between conflicting factors (such as 
integrated photocarriers and conversion gain) can be compared for each 
pixel design, allowing the optimum overall sensor performance in a 
particular technology process to be determined. The present work is based 
on a thorough study of the experimental data acquired from several pixel 
chips fabricated in two different technology processes, 0.5 µm and 0.35 µm 
CMOS. Using these data, the analysis is extended to show the 
efficacy and the suitability of this photoresponse model in scalable CMOS 
processes. Calculations based on this model require only readily available 
process and design data, making it possible to select the designs with 
maximum output signal. 

Figure 3-3 and Figure 3-11 show two subsets of pixels (square and 
rectangular active-area shape, respectively) with decreasing photodiode 
dimensions, fabricated in a standard 0.5 µm CMOS process. 

Figure 3-11. A sample from an otherwise identical set of rectangular active-area pixels 
(CMOS 0.5 µm technology) with decreasing photodiode dimensions. The photodiode areas 
in the full pixel set vary from 63 µm² to 13 µm² and their perimeters from 34 µm to 15.5 µm. 

Figure 3-4 and Figure 3-12 show the corresponding output curves for 
several wavelengths of illumination. These curves share the same behavior; 
i.e., Figure 3-4 curves display a pronounced maximum response location, 
while in Figure 3-12 the curves tend to an extremum. 

The photoresponse model enables the extraction of the unit “main area” 
and unit peripheral contributions to the output signal, and in turn the 
identification and modeling of pixel behavior (see Figure 3-8). 

The fact that the combination of these two contributions remains 
invariable for all pixels at a specific wavelength of illumination (for a 
specific process) enables the extrapolation of the modeled function and in 
turn the identification of the optimal photodiode geometry (for the 
rectangular pixel case, see Figure 3-13). Thus, the model theoretically 
predicts both the existence and the properties of the optimal geometry (the 
optimal dimensions) based on the investigated process and the specific 
design data. 

Figure 3-12. Measured signal output obtained for the pixel set presented in Figure 3-11 as 
a function of the photodiode linear dimensions for two different illumination wavelengths. 

Figure 3-13. A comparison of the modeled results and the measured data at two different 
wavelengths for the pixel set represented in Figure 3-11. Extrapolation of the model predicted 
a photodiode area and perimeter corresponding to a pixel design that would enable maximum 
photoresponse. Note that the optimal pixel was not actually designed and measured, but rather 
its existence and geometric dimensions were revealed by extrapolation of the model. 

The total “main area” and total periphery contributions to the output 
signal have been examined separately as a function of the change in the 
photodiode dimensions. With an increase in the dimensions, the “main area” 
contribution drops and the periphery contribution rises such that the two 
curves intersect. The intersection point is at the exact location where the 
maximum output signal was predicted by extrapolation in Figure 3-13 for the 
particular scalable CMOS process (see Figure 3-14). 

Figure 3-14. Total “main area” and periphery contributions to the output signal as a function 
of the change in the (rectangular) photodiode dimensions for the 0.5 µm CMOS pixel set 
presented in Figure 3-11. The intersection point is at the exact location where the maximum 
output signal was predicted by extrapolation in Figure 3-13. 

The overall influence of scaling on the device sensitivity is very 
complicated and depends on a large variety of parameters. An analytical 
expression that uniquely determines the general scaling trends has not yet 
been developed [26, 27]. An approximation is proposed to describe the 
scaling influence: it is assumed that the ratio between the unit “main area” 
and the unit “periphery” contributions has a slight upward trend, due 
mostly to the reduction of mobility and lifetime with increasing doping 
levels and to shrinkage of the depletion widths. Under strong absorption 
conditions where the absorption coefficient $\alpha(\lambda)$ and the diffusion 
length $L_n$ are both much smaller than the substrate width, the photocurrent 
density due to diffusion in the substrate can be derived [26] as 
$J_{ph} = q\Phi / (1 + 1/(\alpha(\lambda) L_n))$, where $\Phi$ is the photon 
flux entering the quasi-neutral p-substrate region and $q$ is the electron 
charge. With technology downscaling, the unit “periphery” 
contribution to the output signal decreases. The depletion width shrinkage 
means that carriers are collected more through the bottom facet of the 
depletion region than through its lateral facets, thereby intensifying 
the relative “main area” contribution. In addition, the junction depth for 
advanced processes is small in comparison to the absorption depth, such that 
most photocarriers are collected through the bottom depletion facet. Based 
on the assumption above and using the process data and the extracted results 
(the coefficients $k_1$ and $k_2$, which determine the unit “main area” and 
the unit “periphery” contributions for each particular wavelength), it is 
possible to determine the coefficients $k_1$ and $k_2$ for the more advanced 
scalable CMOS 
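technology. It is predicted that 

[Equation (3.6) is missing from the source text.] 

and 

\[
\left.\frac{k_1}{k_2}\right|_{\text{CMOS 0.35 }\mu\text{m}}
= f\!\left(\left.\frac{k_1}{k_2}\right|_{\text{CMOS 0.5 }\mu\text{m}},\;
d_{\text{CMOS 0.5 }\mu\text{m}},\; d_{\text{CMOS 0.35 }\mu\text{m}},\;
L_{n,\,\text{CMOS 0.35 }\mu\text{m}},\; L_{n,\,\text{CMOS 0.5 }\mu\text{m}}\right)
\approx 1.07 \qquad (3.7)
\]

where $d$ is the depletion depth. (The exact functional form $f$ of the 
scaling relation is not legible in the source.) 

Figure 3-15. A subset of a rectangular-shaped active area pixel set with decreasing 
photodiode dimensions (CMOS 0.35 µm technology). The photodiode areas vary from 
13.4 µm² down to 4.3 µm² and their perimeters vary from 15 µm down to 8.1 µm. 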
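
To give a feel for the diffusion photocurrent expression from [26] quoted 
earlier in this section, $J_{ph} = q\Phi/(1 + 1/(\alpha(\lambda) L_n))$, the 
following sketch evaluates the collected fraction $J_{ph}/(q\Phi)$ for the 
diffusion lengths listed in Table 3-1. The absorption coefficient and the 
photon flux are assumed, illustrative values rather than data from the text. 

```python
ELECTRON_CHARGE_C = 1.602176634e-19


def diffusion_photocurrent_density(photon_flux_per_cm2_s, alpha_per_cm,
                                   diffusion_length_um):
    """J_ph = q * Phi / (1 + 1 / (alpha * L_n)), returned in A/cm^2."""
    alpha_times_length = alpha_per_cm * diffusion_length_um * 1e-4  # um -> cm
    return (ELECTRON_CHARGE_C * photon_flux_per_cm2_s
            / (1.0 + 1.0 / alpha_times_length))


# Diffusion lengths (um) for each process generation, from Table 3-1.
DIFFUSION_LENGTH_UM = {
    "0.8 um": 88, "0.5 um": 68, "0.35 um": 41,
    "0.25 um": 33, "0.18 um": 25, "0.1 um": 15,
}

ALPHA_PER_CM = 1.0e3  # assumed absorption coefficient (illustrative only)
PHOTON_FLUX = 1.0e12  # assumed photon flux in photons/(cm^2 * s)

collected_fraction = {
    process: diffusion_photocurrent_density(PHOTON_FLUX, ALPHA_PER_CM, length)
    / (ELECTRON_CHARGE_C * PHOTON_FLUX)
    for process, length in DIFFUSION_LENGTH_UM.items()
}

for process, fraction in collected_fraction.items():
    print(f"{process:>8}: J_ph / (q * Phi) = {fraction:.3f}")
```

For the assumed absorption coefficient the collected fraction falls as the 
diffusion length shrinks; a larger $\alpha(\lambda)$ (shorter wavelength) 
makes the dependence on $L_n$ weaker. 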

An example calculation was performed for a set of pixels with a rectangular 
photodiode shape, designed according to the CMOS 0.35 µm rules and 
obeying the same mathematical guidelines as the pixels of the older 
technology presented in Figure 3-12. An example subset of these newer pixels 
is shown in Figure 3-15. All the pixels share a common traditional 
three-transistor readout circuit, enabling identification of the behavior of 
different pixel types. 

The predictions of equations (3.6) and (3.7) were used to find a pixel that 
enabled maximum response. In Figure 3-16, the intersection point of the 
calculated total “main area” and total “periphery” contributions 
indicates the maximum-photoresponse pixel geometry for the pixel set 
designed and fabricated in the more advanced 0.35 µm CMOS technology 
(shown in Figure 3-15). 

Figure 3-17 shows the comparison between the measured and 
theoretically modeled output curves for several wavelengths of illumination 
where an obvious maximum-response geometry was indicated. Note that the 
modeled function (based on available process and design data) reaches its 
maximum exactly at the point marked by the measurements. Moreover, the 
values obtained from the measurements of the contributions ratio are 

\[
\left.\frac{k_1}{k_2}\right|_{\text{CMOS 0.35 }\mu\text{m}} = 1.13 \qquad (3.8)
\]

and 

\[
\left.\frac{k_1}{k_2}\right|_{490\,\text{nm},\ \text{CMOS 0.35 }\mu\text{m}} = 1.068 \qquad (3.9)
\]

These are similar to our theoretical results in equations (3.6) and (3.7). 
The maximum occurs exactly at the previously predicted intersection point 
(Figure 3-16). 

Figure 3-16. Total “main area” and periphery contributions to the output signal as a function 
of the changes in the photodiode dimensions for the 0.35 µm CMOS pixel set presented in 
Figure 3-15. Note that this result is obtained theoretically, based only on the experimental 
results obtained from an older CMOS process and scaling considerations. 

The theoretical result described here was obtained for the 0.35 µm 
CMOS design based only on the parameters extracted from the available 
design and process data; i.e., there was no need for an actual study with a 
0.35 µm test chip. The optimum geometry, or the pixel that enabled the 
maximum photoresponse, was predicted theoretically based only on the 
experimental results obtained from older process data and scaling 
considerations. The measurements confirmed the theoretical prediction. 

The applicability of this photoresponse model for photodiode-based CMOS 
APS does not appear to be constrained to a specific technology process. It 
can therefore be used as a predictive tool for design optimization of each 
potential application in scalable CMOS technologies. 

Figure 3-17. A comparison of the modeled and the measured results obtained for the pixels 
presented in Figure 3-15 at two different wavelengths of illumination. The geometry of the 
pixel that enabled maximum photoresponse is indicated. The maximum occurred exactly at 
the intersection point indicated in Figure 3-16; i.e., the theoretically predicted result was 
confirmed by the measurements. 

3.5 Summary 

A closed-form semi-analytical expression has been presented for 
diffusion-limited CMOS APS pixel photosignal in a closely spaced 
photodiode array. This expression represents the pixel photosignal 
dependence on the pixel geometrical shape and fill factor, i.e., the 
photodiode active area and perimeter. It enables identification of the 
behavior of different pixel types and shows how changes in the device 
geometry affect its overall performance. In addition, the expression 
introduces a method for the determination and prediction of CMOS 
photodiode quantum efficiency based on the process and design data. 

The results indicate that the tradeoffs between conflicting factors (such as 
integrated photocarriers and conversion gain) can be compared for each 
potential pixel design, and a reliable estimate for optimum overall sensor 
performance is possible. 

The model makes possible the theoretical prediction of pixel 
designs that enable the extraction of maximum output signal for any selected 
photodiode shape. This prediction is based only on the usual process and 
design data available for different scalable CMOS technologies, and thus can 
be a practical tool for design optimization. 

Abstract: Since active pixel sensors (APS) are fabricated in a commonly used CMOS 
process, image sensors with integrated “intelligence” can be designed. These 
sensors are very useful in many scientific, commercial and consumer 
applications. Current state-of-the-art CMOS imagers allow integration of all 
functions required for timing, exposure control, color processing, image 
enhancement, image compression, and ADC on the same die. In addition, 
CMOS imagers offer significant advantages and rival traditional 
charge-coupled devices (CCDs) in terms of low power, low voltage and monolithic 
integration. This chapter presents different types of CMOS pixels and 
introduces the system-on-a-chip approach, showing examples of two “smart” 
APS imagers. The camera-on-a-chip approach is introduced, focusing on the 
advantages of CMOS sensors over CCDs. Different types of image sensors are 
described and their modes of operation briefly explained. Two examples of 
CMOS imagers are presented, a smart vision system-on-a-chip and a smart 
tracking sensor. The former is based on a photodiode APS with linear output 
over a wide dynamic range, made possible by random access to each pixel and 
by the insertion of additional circuitry into the pixels. The latter is a smart 
tracking sensor employing analog non-linear winner-take-all (WTA) selection. 

Keywords: CMOS image sensor, active pixel sensor (APS), charge-coupled device 
(CCD), passive pixel sensor (PPS), system-on-a-chip, smart sensor, dynamic range 
(DR), winner-take-all (WTA) circuit. 

4.1 Introduction 

Driven by the demands of multimedia applications, image sensors have 
become a major category of high-volume semiconductor production. The 
introduction of imaging devices is imminent in consumer applications such 
as cell phones, automobiles, computer-based video, smart toys and both still 
and video digital cameras. 

In addition to image capture, the electronics in a digital camera must 
handle analog-to-digital conversion (ADC) as well as a significant amount of 
digital processing for color imaging, image enhancement, compression, 
control and interfacing. These functions are usually implemented with many 
chips fabricated in different process technologies. 

The continuous advances in CMOS technology for processors and 
DRAMs have made CMOS sensor arrays a viable alternative to the popular 
charge-coupled device (CCD) sensor technology. New technologies provide 
the potential for integrating all imaging and processing functions onto a 
single chip, greatly reducing the cost, power consumption and size of the 
camera [1-3]. Standard CMOS mixed-signal technology allows the 
manufacture of monolithically integrated imaging devices: all the functions 
for timing, exposure control and ADC can be implemented on one piece of 
silicon, enabling the production of the so-called “camera-on-a-chip” [4]. 
Figure 4-1 is a diagram of a typical digital camera system, showing the 
difference between the building blocks of commonly used CCD cameras and 
the CMOS camera-on-a-chip. The traditional imaging pipeline functions — 
such as color processing, image enhancement and image compression — can 
also be integrated into the camera. This enables quick processing and 
exchanging of images. The unique features of CMOS digital cameras allow 
many new applications, including network teleconferencing, videophones, 
guidance and navigation, automotive imaging systems, and robotic and 
machine vision. 

Most digital cameras currently use CCDs to implement the image sensor. 
State-of-the-art CCD imagers are based on a mature technology and present 
excellent performance and image quality. They are still unsurpassed for high 
sensitivity and long exposure time, thanks to extremely low noise, high 
quantum efficiency and very high fill factors. Unfortunately, CCDs need 
specialized clock drivers that must provide clocking signals with relatively 
large amplitudes (up to 10 V) and well-defined shapes. Multiple supply and 
bias voltages at non-standard values (up to 15 V) are often necessary, 
resulting in very complex systems. 

Figure 4-2 is a block diagram of a widely used interline transfer CCD 
image sensor. In such sensors, incident photons are converted to charge, 
which is accumulated by the photodetectors during exposure time. In the 
subsequent readout time, the accumulated charge is sequentially transferred 
into the vertical and horizontal CCDs and then shifted to the chip-level 
output amplifier. However, the sequential readout of pixel charge limits the 
readout speed. Furthermore, CCDs are high-capacitance devices and during 
readout, all the capacitors are switched at the same time with high voltages; 
as a result, CCD image sensors usually consume a great deal of power. 
CCDs also cannot easily be integrated with CMOS circuits due to additional 
fabrication complexity and increased cost. Because it is very difficult to 
integrate all camera functions onto a single CCD chip, multiple chips must 
be used. A regular digital camera based on CCD image sensors is therefore 
burdened with high power consumption, large size and a relatively complex 
design; consequently, it is not well suited for portable imaging applications. 

Figure 4-1. Block diagram of a typical digital camera system. 

Unlike CCD image sensors, CMOS imagers use digital memory style 
readout, using row decoders and column amplifiers. This readout overcomes 
many of the problems found with CCD image sensors: readout can be very 
fast, it can consume very little power, and random access of pixel values is 
possible so that selective readout of windows of interest is allowed. Figure 
4-3 shows the block diagram of a typical CMOS image sensor. 
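
The selective readout of a window of interest can be pictured with a small 
software analogy. This is only a sketch of the addressing idea, not a model 
of any particular sensor: the frame is a hypothetical list of rows, the outer 
loop plays the role of the row decoder, and the slice plays the role of the 
column addressing. 

```python
def read_window(pixel_array, first_row, first_col, n_rows, n_cols):
    """Read a window of interest row by row, touching only the rows needed."""
    window = []
    for row in range(first_row, first_row + n_rows):
        selected_row = pixel_array[row]  # row decoder selects one row
        # Column addressing picks only the columns inside the window.
        window.append(selected_row[first_col:first_col + n_cols])
    return window


# Hypothetical 4x6 frame whose pixel value encodes its position (10*row + col).
frame = [[10 * r + c for c in range(6)] for r in range(4)]

window = read_window(frame, first_row=1, first_col=2, n_rows=2, n_cols=3)
print(window)  # [[12, 13, 14], [22, 23, 24]]
```

Only two of the four rows are accessed here, which is the source of the 
speed and power benefit when just a region of the array is of interest. 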

The power consumption of the overall system can be reduced because 
many of the supporting external electronic components required by a CCD 
sensor can be fabricated directly inside a CMOS sensor. Low power 
consumption helps to reduce the temperature (or the temperature gradient) of 
both the sensor and the camera head, leading to improved performance. 

An additional advantage of CMOS imagers is that analog signal 
processing can be integrated onto the same substrate; this has already been 
demonstrated by some video camera-on-a-chip systems. Analog signal 
processing can include widening the dynamic range of the sensor, real-time 
object tracking, edge detection, motion detection and image compression. 
These functions are usually performed by nonlinear analog circuits and can 
be implemented inside the pixels and in the periphery of the array. 
Offloading signal processing functions makes more memory and DSP 
processing time available for higher-level tasks, such as image segmentation 
or tasks unrelated to imaging. 

Figure 4-2. Block diagram of a typical interline transfer CCD image sensor. 

This chapter presents a variety of implementations of CMOS image 
sensors, focusing on two examples of system-on-a-chip design: an image 
sensor with wide dynamic range (DR) [5] and a tracking CMOS imager 
employing analog winner-take-all (WTA) selection [6]. 

4.2 CMOS image sensors 

CMOS pixels can be divided into two main groups, passive pixel sensors 
(PPS) and active pixel sensors (APS). 

Each individual pixel of a PPS array has only a photosensing element 
(usually a photodiode) and a switching MOSFET. The signal is detected 
either by an output amplifier implemented in each column or by a single 
output for the entire imaging device. These conventional MOS-array sensors 
operate like an analog DRAM, offering the advantage of random access to 
the individual pixels. They suffer from relatively poor noise performance 
and reduced sensitivity compared to state-of-the-art CCD sensors. 

APS Design: From Pixels to Systems 

Figure 4-3. Block diagram of a typical CMOS image sensor. 

APS arrays are novel image sensors that have amplifiers implemented in 
every pixel; this significantly improves the noise performance. 

4.2.1 Passive Pixel Sensors 

The PPS consists of a photodiode and just one transistor (labeled TX in 
Figure 4-4). TX is used as a charge gate, switching the contents of the pixel 
to the charge integration amplifier (CIA). These passive pixel CMOS 
sensors operate like analog DRAMs, as shown in Figure 4-5. 

More modern PPS implementations use a CIA for each column in the 
array, as shown in Figure 4-6. The CIA readout circuit is located at the 
bottom of each column bus (to keep the voltage on that bus constant) and 
uses just one addressing transistor. The voltage $V_{ref}$ is used to reset the 
photodiode to reverse bias. Following the reset, the addressing switch is 
opened for a period of integration time ($\tau_{int}$). During this period, the 
photodiode discharges at a rate approximately proportional to the amount of 
incident illumination. When the MOS switch is closed again to reset the 
photodiode once more, a current flows via the resistance and capacitance of 
the column bus due to the difference between $V_{ref}$ and the voltage on the 
diode ($V_{diode}$). The total charge that flows to reset the pixel is equal to 
that discharged during the integration period. This charge is integrated on 
the capacitor $C_{int}$ and output as a voltage. When the final bus and diode 
voltages return to $V_{ref}$ via the charge amplifier, the address MOS switch 
is turned off, the voltage across $C_{int}$ is removed by the Reset transistor, 
and the integration process starts again. 

Figure 4-4. Passive pixel sensor structure. 
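
The charge balance described above can be written out as a short 
calculation. The sketch assumes an ideal charge integration amplifier and a 
photocurrent that is constant over the integration time; the function name 
and the operating point are hypothetical. 

```python
def pps_output_voltage(photocurrent_a, integration_time_s, c_int_f):
    """Voltage on C_int after the reset charge is integrated by the CIA."""
    # Charge lost by the photodiode while the address switch is open.
    discharged_charge_c = photocurrent_a * integration_time_s
    # The same charge flows through the column bus during reset and is
    # integrated on C_int.
    return discharged_charge_c / c_int_f


# Hypothetical operating point: 1 pA photocurrent, 30 ms integration, 100 fF.
v_out = pps_output_voltage(photocurrent_a=1e-12,
                           integration_time_s=30e-3,
                           c_int_f=100e-15)
print(f"{v_out:.3f} V")  # 0.300 V
```

In this idealized picture the output depends on $C_{int}$ rather than on the 
(much larger) column bus capacitance, which is why the amplifier holds the 
bus at a constant voltage. 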

The passive pixel structure has major problems due to its large capacitive 
loads. Since the large bus is directly connected to each pixel during readout, 
the RC time constant is very high and the readout is slow. In addition, 
passive pixel readout noise is typically high, on the order of 250 electrons 
rms compared to less than 10 electrons rms for commercial CCDs. Because 
of these factors, PPS does not scale well to larger array sizes or faster pixel 
readout rates. Furthermore, differences between the individual amplifiers at 
the bottoms of the different columns will cause fixed pattern noise (FPN). 
FPN is time-independent and arises from component mismatch due to 
variations in lithography, doping and other manufacturing processes. 

Figure 4-5. Basic PPS architecture. 

Figure 4-6. PPS implementation with a separate CIA for each column in the array (after [13]). 

PPS also offers advantages. For a given pixel size, it has the highest 
design fill factor (the ratio of the light sensitive area of a pixel to its total 
area), since each pixel has only one transistor. In addition, its quantum 
efficiency (QE) — the ratio between the number of generated electrons and 
the number of impinging photons — can be quite high due to this large fill 
factor. 

4.2.2 Active Pixel Sensors 

The passive pixel sensor was introduced by Weckler in 1967 [7]. The 
problems of PPS were recognized, and consequently a sensor with an active 
amplifier (a source follower transistor) within each pixel was proposed [8]. 
The current term for this technology, active pixel sensor, was first 
introduced by Fossum in 1992 [1]. Figure 4-7 shows the general architecture 
of an APS array and the principal pixel structure. A detailed description of 
the readout procedure will be presented later in this chapter. 

Active pixels typically have a fill factor of only 50-70%, which reduces 
the photon-generated signal. However, the reduced capacitance in each pixel 
leads to lower read noise for the array, which increases both the dynamic 
range and the signal-to-noise ratio (SNR). 

Figure 4-7. General architecture of an APS array. 

The pixels used in these sensors can be divided into three types: 
photodiodes, photogates, and pinned photodiodes. The most popular is 
currently the photodiode. 

4.2.3 Photodiode APS 

The photodiode APS was described by Noble in 1968 [8] and has been 
under investigation by Andoh since the late 1980s [9]. A novel technique for 
random access and electronic shuttering with this type of pixel was proposed 
by Yadid-Pecht in the early 1990s [12]. 

The basic photodiode APS employs a photodiode and a readout circuit of 
three transistors: a photodiode reset transistor (Reset), a row select transistor 
(RS) and a source-follower transistor (SF). The schematic of this pixel is 
shown in Figure 4-8. 

Figure 4-8. Basic photodiode APS pixel. 



The charge-to-voltage conversion occurs at the sense node capacitance, 
which comprises the photodiode capacitance and all other parasitic 
capacitances connected to that node. In this case, these are the source 
capacitance of the Reset transistor and the gate capacitance of the SF 
transistor. The SF transistor acts as a buffer amplifier to isolate the sensing 
node; the load of this buffer (the active-current-source load) is located on 
each column rather than on each pixel to keep the fill factor high and to 
reduce pixel-to-pixel variations. The Reset transistor controls the integration 
time and is usually implemented with an NMOS transistor. Since no 
additional well is required for NMOS implementation, this allows a higher 
fill factor. However, an NMOS transistor with $V_{DD}$ on both gate and drain 
can only reach a source voltage (at the photodiode node) of $V_{DD} - V_T$, 
thereby decreasing the dynamic range of the pixel. An example of a mask 
layout for this pixel architecture is shown in Figure 4-9. 
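
The two quantities discussed in this paragraph, the sense node capacitance 
and the reset level reached through an NMOS reset transistor, can be put into 
numbers with a short sketch. All component values are hypothetical, and the 
body effect on $V_T$ is ignored for simplicity. 

```python
ELECTRON_CHARGE_C = 1.602176634e-19


def sense_node_capacitance_f(c_photodiode_f, c_reset_source_f, c_sf_gate_f):
    """Photodiode capacitance plus the parasitics attached to the sense node."""
    return c_photodiode_f + c_reset_source_f + c_sf_gate_f


def conversion_gain_uv_per_electron(c_sense_f):
    """Charge-to-voltage conversion gain q/C in microvolts per electron."""
    return ELECTRON_CHARGE_C / c_sense_f * 1e6


def nmos_reset_level_v(v_dd, v_threshold):
    """Highest photodiode voltage reachable with V_DD on gate and drain."""
    return v_dd - v_threshold


# Hypothetical values: 8 fF photodiode, 1 fF reset source, 1 fF SF gate.
c_sense = sense_node_capacitance_f(8e-15, 1e-15, 1e-15)
gain = conversion_gain_uv_per_electron(c_sense)
reset_level = nmos_reset_level_v(v_dd=3.3, v_threshold=0.7)

print(f"C_sense = {c_sense * 1e15:.1f} fF, gain = {gain:.2f} uV/e-")
print(f"reset level = {reset_level:.2f} V out of a 3.30 V supply")
```

With these numbers the threshold drop removes 0.7 V of a 3.3 V supply from 
the available swing, which illustrates the loss of dynamic range mentioned 
above. 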

Photodiode APS operation and readout are described here with reference 
to both Figure 4-7 and Figure 4-8. Generally, pixel operation can be divided 
into two main stages, reset and phototransduction. 

(a) The reset stage. During this stage, the photodiode capacitance is 
charged to a reset voltage by turning on the Reset transistor. This reset 
voltage is read out to one of the sample-and-hold (S/H) circuits in a 
correlated double sampling (CDS) circuit. The CDS circuit, usually located at 
the bottom of each column, subtracts the signal pixel value from the reset 
value. Its main purpose is to eliminate fixed pattern noise caused by random 
variations in the threshold voltage of the reset and pixel amplifier 
transistors, variations in the photodetector geometry and variations in the 
dark current. In addition, it should eliminate the 1/f noise in the circuit. 

(b) The phototransduction stage. During this stage, the photodiode 
capacitor is discharged over a constant integration time at a rate 
approximately proportional to the incident illumination. Therefore, a bright 
pixel produces a low analog signal voltage and a background pixel gives a 
high signal voltage. This voltage is read out to the second S/H of the CDS by 
enabling the row select transistor of the pixel. The CDS outputs the 
difference between the reset voltage level and the photovoltage level. 

Figure 4-9. An example of a pixel layout with an L-shaped active area, which is the most 
common pixel design. 
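
The offset cancellation performed by the CDS can be shown with a minimal 
numerical sketch. It models only a static per-pixel offset (for example from 
threshold voltage mismatch) that appears identically in the reset and signal 
samples; the voltages are hypothetical and temporal noise is not modeled. 

```python
def correlated_double_sample(reset_level_v, signal_level_v):
    """CDS output: reset sample minus signal sample."""
    return reset_level_v - signal_level_v


# Three hypothetical pixels under identical illumination. Each has a
# different static offset that shifts both of its samples equally.
pixel_offsets_v = [0.00, 0.03, -0.02]
nominal_reset_v = 2.60
photo_drop_v = 0.40  # discharge caused by the illumination

cds_outputs = []
for offset in pixel_offsets_v:
    reset_level = nominal_reset_v + offset
    signal_level = reset_level - photo_drop_v
    cds_outputs.append(correlated_double_sample(reset_level, signal_level))

print([round(v, 3) for v in cds_outputs])  # [0.4, 0.4, 0.4]
```

The raw signal levels differ from pixel to pixel, but the differences vanish 
after subtraction, which is how the fixed pattern component is removed. 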

Because the readout of all pixels cannot be performed in parallel, a 
rolling readout technique is applied. All the pixels in each row are reset and 
read out in parallel, but the different rows are processed sequentially. Figure 
4-10 shows the time dependence of the rolling readout principle. A given 
row is accessed only once during the frame time ($T_{frame}$). The actual pixel 
operation sequence consists of three steps: the accumulated signal value of 
the previous frame is read out, the pixel is reset, and the reset value is read 
out to the CDS. Thus, the CDS circuit actually subtracts the signal pixel value 
from the reset value of the next frame. Because CDS is not truly correlated 
without frame memory, the read noise is limited by the reset noise on the 
photodiode. After the signals and resets of all pixels in the row are read out 
to S/H, the outputs of all CDS circuits are sequentially read out using 
X-addressing circuitry, as shown in Figure 4-7. 
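
The timing of the rolling readout can be sketched as a simple schedule. The 
code assumes equally spaced row accesses and an integration time equal to the 
full frame time (no electronic shutter); both are simplifying assumptions, 
and the function names are illustrative. 

```python
def rolling_readout_schedule(n_rows, frame_time_s):
    """Return (row, access_time) pairs; each row is accessed once per frame."""
    row_time_s = frame_time_s / n_rows
    return [(row, row * row_time_s) for row in range(n_rows)]


def integration_window(row, n_rows, frame_time_s, frame_index=0):
    """Integration runs from a row's reset to its readout one frame later."""
    row_time_s = frame_time_s / n_rows
    reset_time_s = frame_index * frame_time_s + row * row_time_s
    return reset_time_s, reset_time_s + frame_time_s


# Hypothetical 4-row array read out at 25 frames per second (40 ms frame).
schedule = rolling_readout_schedule(n_rows=4, frame_time_s=0.040)
for row, access_time in schedule:
    start, end = integration_window(row, 4, 0.040)
    print(f"row {row}: accessed at {access_time * 1e3:.0f} ms, "
          f"integrates from {start * 1e3:.0f} to {end * 1e3:.0f} ms")
```

Every row integrates for the same duration, but the windows are staggered by 
one row time, so different rows sample the scene at slightly different 
moments. 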

The output photodiode signal is supposedly independent of detector size, 
because the lower pixel capacitance of smaller detectors causes an increase 
in conversion gain that compensates for the decrease in detector size. 
However, peripheral capacitances from the perimeters of the detector 
increase the total capacitance of the sensing node and thus decrease the 
conversion gain. As the pixel size scales down, photosensitivity decreases 
and the reset noise scales as $C^{1/2}$, where $C$ is the photodiode capacitance. 
These tradeoffs must be considered when designing pixel fill factor, DR, 
SNR and conversion gain. 
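
The square-root dependence of the reset noise on capacitance follows from 
the standard kTC expression, in which the rms noise charge is 
$\sqrt{kTC}$. The sketch below evaluates it in electrons; the temperature of 
300 K and the capacitance values are assumptions chosen for illustration. 

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23
ELECTRON_CHARGE_C = 1.602176634e-19


def reset_noise_electrons(capacitance_f, temperature_k=300.0):
    """rms kTC reset noise expressed in electrons: sqrt(k*T*C) / q."""
    noise_charge_c = math.sqrt(BOLTZMANN_J_PER_K * temperature_k * capacitance_f)
    return noise_charge_c / ELECTRON_CHARGE_C


# Hypothetical photodiode capacitances in farads.
for capacitance in (2.5e-15, 10e-15, 40e-15):
    print(f"C = {capacitance * 1e15:4.1f} fF -> "
          f"{reset_noise_electrons(capacitance):5.1f} e- rms")
```

Quadrupling the capacitance doubles the noise charge, while the conversion 
gain q/C falls by a factor of four, so the two effects have to be weighed 
together when sizing the photodiode. 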
Figure 4-10. Rolling readout principle of the photodiode APS. 

4.2.4 Photogate APS 

Figure 4-11 shows the common photogate pixel architecture [10]. The 
basic concept for the photogate pixel arose from CCD technology. While 
photon-generated charge is integrated under a photogate with a high 
potential well, the output floating node is reset and the corresponding 
voltage is read out to the S/H of the CDS. When the integration is 
completed, the charge is transferred to the output floating node by pulsing 
the signal on the photogate. Then the corresponding voltage from the 
integrated charge is read by the source follower to the second S/H of the 
CDS. The CDS outputs the difference between the reset voltage level and 
the photovoltage level. 

Figure 4-11. Basic photogate pixel architecture. 

Figure 4-12. Example layout of a photogate pixel design. 

As mentioned above, the CDS can suppress reset noise, 1/f noise and
FPN due to $V_T$ and lithographic variations in the array. The reduction of
noise level increases the total dynamic range and the SNR. The primary 
noise source for the photogate APS is photon shot noise, which cannot be 
suppressed by any means. 
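The difference between truly correlated sampling (photogate APS) and the uncorrelated subtraction described earlier for the photodiode APS can be sketched with a few lines of Python. This is a simplified model, not the circuit itself: the reset level, signal swing and per-reset noise samples are hypothetical numbers chosen for illustration.

```python
V_RESET = 2.0        # nominal reset level in volts (hypothetical)
PHOTO_SIGNAL = 0.5   # voltage drop caused by the integrated charge (hypothetical)
RESET_NOISE = [0.003, -0.002, 0.001]  # one noise sample per reset event (hypothetical)


def cds_output(reset_sample, signal_sample):
    """The CDS outputs the reset level minus the signal level."""
    return reset_sample - signal_sample


def photogate_frame(k):
    """Reset and signal are sampled after the same reset event, so its noise cancels."""
    reset_level = V_RESET + RESET_NOISE[k]
    signal_level = reset_level - PHOTO_SIGNAL
    return cds_output(reset_level, signal_level)


def photodiode_frame(k):
    """The signal of frame k is subtracted from the reset level of frame k + 1."""
    signal_level = V_RESET + RESET_NOISE[k] - PHOTO_SIGNAL
    next_reset_level = V_RESET + RESET_NOISE[k + 1]
    return cds_output(next_reset_level, signal_level)


print(photogate_frame(0))   # the photo signal alone
print(photodiode_frame(0))  # photo signal plus the difference of two noise samples
```

In this model the photogate output equals the photo signal, while the photodiode output retains the difference between two independent reset noise samples.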

The photogate has a pixel pitch typically equal to 20 times the minimum
feature size of the technology, since there are five transistors in each pixel. Due to
the overlaying polysilicon, however, there is a reduction in QE, particularly 
in the blue region of the spectrum. The photogate pixel architecture is shown 
as a mask layout in Figure 4-12. 

4.2.5 Pinned photodiode APS 

The pinned photodiode pixel consists of a pinned diode (p+-n-p), where
the photon collection area is pulled away from the surface in order to
reduce surface defect noise (such as that due to dark current) [13].
Photon-generated charge is integrated under a pinned diode and transferred to the
output floating diffusion for the readout. As in the photogate APS, the sense 
node and integration node are separated to minimize noise. However, the 
primary difference is that the potential well for charge collection in a pinned 
diode is generated by a buried intrinsic layer (or an n-type layer) instead of a 
pulsed gate voltage as in the photogate. Each pinned diode pixel has four 
transistors and five control lines, resulting in a fill factor higher than in the 
photogate pixel, but lower than in the photodiode pixel. However, the small
photon collection area of the pinned diode results in a very small full well
for photon-generated charge collection and lower QE compared to the
photodiode pixel. Figure 4-13 shows a basic pinned photodiode pixel
structure.

Figure 4-13. Basic pinned photodiode pixel architecture.

4.2.6 Logarithmic Photodiode APS 

Another type of APS is a logarithmic photodiode sensor [11]. This three- 
transistor pixel enables logarithmic encoding of the photocurrent, thus 
increasing the dynamic range of the sensor; i.e., the same range of sensor 
output voltage is suitable for a wider range of illumination. The 
implementation of the basic logarithmic photodiode pixel is described in 
Figure 4-14. 

Figure 4-14. Basic logarithmic photodiode pixel architecture.

Figure 4-15. Architecture of a snapshot photodiode APS (after [12]).

This pixel does not require reset and operates continuously. The voltage
on the photodiode ($V_{PH}$) is approximately equal to $V_{DD}$, causing the load
transistor to operate in the subthreshold region ($V_{DD} = V_{PH} + \Delta V_{PH}$). The
photocurrent ($I_{PH}$) is equal to the subthreshold current ($I_{DS}$). The voltage on
the photodiode decreases logarithmically with linear increases in
illumination (and thus photocurrent) following the equation

$$V_{PH} = V_{DD} - \Delta V_{PH} = V_{DD} - \frac{kT}{q} \cdot \ln\left(\frac{I_{PH}}{I_0}\right) \tag{4.1}$$

where $kT/q$ is 0.026 V at T = 300 K and $I_0$ represents all constant terms.
While the logarithmic APS has the above-mentioned advantages, it suffers 
from serious drawbacks such as significant temperature dependence of the 
output, low swing of the output (especially for relatively low illumination 
levels) and high FPN. Accordingly, the logarithmic APS is not commonly 
used. 
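The low output swing follows directly from equation (4.1). The Python sketch below evaluates it for several photocurrents. The supply voltage and the constant $I_0$ are hypothetical values chosen only to produce plausible numbers; the thermal voltage is the 0.026 V quoted in the text.

```python
import math

THERMAL_VOLTAGE = 0.026  # kT/q at T = 300 K, in volts (value from the text)


def log_pixel_voltage(i_ph, v_dd=3.3, i_0=1e-15):
    """Photodiode voltage per equation (4.1); v_dd and i_0 are hypothetical."""
    return v_dd - THERMAL_VOLTAGE * math.log(i_ph / i_0)


# Sweep the photocurrent over six decades (amperes).
for exponent in range(-14, -8):
    i_ph = 10.0 ** exponent
    print(f"I_PH = 1e{exponent} A -> V_PH = {log_pixel_voltage(i_ph):.3f} V")

# The voltage step per decade of photocurrent does not depend on v_dd or i_0.
volts_per_decade = THERMAL_VOLTAGE * math.log(10)
print(f"swing per decade: {volts_per_decade * 1e3:.1f} mV")
```

Each tenfold increase in illumination changes the output by only about 60 mV, which is why many decades fit into a small voltage range and also why the swing is low at low illumination.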



4.2.7 Snapshot pixels 

Most CMOS imagers feature the rolling shutter readout method (also 
known as the rolling readout method) shown in Figure 4-10. In the rolling 
shutter approach, the start and end of the light collection for each row is 
slightly delayed from the previous row; this leads to image deformity when 
there is relative motion between the imager and the scene. The ideal solution 
for imaging objects moving at high speed is the snapshot imager, which 
employs the electronic global shutter method [12]. This technique uses a 
memory element inside each pixel and provides capabilities similar to a 
mechanical shutter: it allows simultaneous integration of the entire pixel 
array and then stops the exposure while the image data is read out. The 
principal scheme of snapshot photodiode APS was introduced by
Yadid-Pecht in 1991, and is shown in Figure 4-15.

Figure 4-16. Transistor scheme of a commonly used snapshot pixel.

The snapshot pixel includes a sample-and-hold (S/H) switch with analog 
storage, which consists of all parasitic capacitances in the amplifier input. 
The in-pixel amplification is performed by a source follower amplifier, 
identical to that in a rolling shutter pixel. The full transistor scheme of a 
commonly used global shutter pixel is shown in Figure 4-16. 

In contrast to the rolling shutter technique, a sensor with global shutter 
architecture exposes all its pixels at the same time. After the integration time 
$T_{int}$, the signal charge is stored in an in-pixel sample-and-hold capacitance
until readout. One of the problems that should be addressed in the snapshot
pixels is the shutter efficiency. The light exposure of the S/H stage, shutter
leakage and the limited storage capacitance lead to signal loss. Figure 4-16
shows a pixel that employs an NMOS transistor as a shutter. This 
implementation allows a small pixel area, but it has a low shutter efficiency. 
Shutter efficiency can be increased using a PMOS transistor as a shutter, if it 
is well separated from the photodiode. Unfortunately, a PMOS shutter 
decreases the fill factor and, due to increased parasitic capacitances, also 
decreases the conversion gain. 

4.3 APS system-on-a-chip approach 

CMOS image sensors allow implementations of complex sensing 
systems on a single silicon die. For example, almost all CMOS imagers 
employ analog to digital conversion on the same die. There are three general 
approaches to implementing ADC with active pixel sensors: 

1. Chip-level ADC, where a single ADC circuit serves the whole APS
array. This method requires a very high-speed ADC, especially if a
very large array is used.

2. Column-level ADC, where an array of ADCs is placed at the bottom
of the APS array and each ADC is dedicated to one or more columns
of the APS array. All these ADCs are operated in parallel, so a
low-to-medium-speed ADC design can be used, depending on the APS
array size. The disadvantages of this approach are the necessity of
fitting each ADC within the pixel pitch (i.e., the column width) and
the possible problems of mismatch among the converters on different
columns.

3. Pixel-level ADC, where every pixel has its own converter. This
approach allows parallel operation of all ADCs in the APS array, so a
very low speed ADC is suitable. Using one ADC per pixel has
additional advantages, such as higher SNR, lower power and simpler
design.
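The speed requirements of the three approaches can be compared with a short calculation. This Python sketch assumes one conversion per pixel per frame and ignores blanking and settling overhead; the function name and the `columns_per_adc` parameter are illustrative, not taken from the text.

```python
def adc_sample_rates(rows, cols, frames_per_second, columns_per_adc=1):
    """Required conversions per second for each ADC under the three approaches."""
    return {
        # One ADC converts every pixel of every frame.
        "chip": rows * cols * frames_per_second,
        # Each ADC converts the pixels of the column(s) it serves.
        "column": rows * columns_per_adc * frames_per_second,
        # Each ADC converts a single pixel once per frame.
        "pixel": frames_per_second,
    }


rates = adc_sample_rates(rows=64, cols=64, frames_per_second=30)
for level, rate in rates.items():
    print(f"{level:>6}-level ADC: {rate} conversions/s")
```

For a 64 x 64 array at 30 frames per second, the chip-level converter must run 64 times faster than a column-level converter serving one column.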

In this section, two examples of CMOS imagers are described. The first 
implements a “smart vision” system-on-a-chip based on a photodiode APS 
with linear output over a wide dynamic range. An increase in the dynamic 
range of the sensors is enabled by random access to the pixel array and by 
the insertion of additional circuitry within each pixel. The second example is 
a smart tracking sensor employing analog nonlinear winner-take-all 
selection. 

In the design of a “smart” sensing system, an important step is to decide 
whether computation circuitry should be inserted within the pixel or placed 
in the periphery of the array. When processing circuitry is put within the 
pixel, additional functions can be implemented, simple 2-D processing is 
possible, and neighboring pixels can be easily shared in neural networks. 
These systems are also very useful for real-time applications. On the other 
hand, the fill factor is drastically reduced, making these systems unsuitable 
for applications where high spatial resolution and very high image quality 
are required. In all systems presented later in this chapter, most of the 
processing circuitry is placed in the periphery of the array to avoid 
degradation of image quality. 

4.3.1 Autoscaling APS with customized increase of dynamic range 

This section introduces the reader to the dynamic range problem in 
CMOS imagers, showing possible existing solutions. Then an advanced 
autoscaling CMOS APS with customized linear increase of DR is explained. 

4.3.1.1 Dynamic range problem and possible solutions

Scenes imaged with electronic cameras can have a wide range of 
illumination. Levels can range from $10^{-3}$ lux for night vision to $10^{5}$ lux for
scenes illuminated with bright sunlight, and even higher levels can occur
with the direct viewing of light sources such as oncoming headlights. The
intrascene dynamic range capability of a sensor is measured as

$$DR = 20 \log\left(\frac{S}{N}\right) \tag{4.2}$$

where S is the saturation level and N is the root mean square (rms) read noise 
floor measured in electrons or volts. The human eye has a dynamic range of 
about 90 dB and camera film of about 80 dB, but typical CCDs and CMOS 
APS have a dynamic range of only 65-75 dB. Generally, dynamic range can
be increased in two ways: by reducing noise, which extends the dynamic
range toward darker scenes, or by raising the incident-light saturation
level, which extends it toward brighter scenes.
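Equation (4.2) can be evaluated directly. The following Python sketch takes the logarithm as base 10 (the usual decibel convention, assumed here) and also converts decibels to an equivalent number of bits, using the fact that each factor of two in S/N adds about 6.02 dB. The function names and the sample numbers are illustrative.

```python
import math


def dynamic_range_db(saturation, noise_floor):
    """Equation (4.2): DR = 20 log10(S/N), with S and N in the same units."""
    return 20.0 * math.log10(saturation / noise_floor)


def db_to_bits(dr_db):
    """Equivalent number of binary digits: one bit per factor of two in S/N."""
    return dr_db / (20.0 * math.log10(2.0))


# Hypothetical sensor: 1.0 V saturation level, 0.5 mV rms read noise.
dr = dynamic_range_db(1.0, 0.5e-3)
print(f"DR = {dr:.1f} dB, about {db_to_bits(dr):.1f} bits")
```

Both routes named above show up in the formula: lowering N or raising S increases the ratio by the same amount.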

Bright scenes and wide variations in intrascene illumination can arise in 
many situations: driving at night, photographing people in front of a 
window, observing an aircraft landing at night, and imaging objects for 
studies in meteorology or astronomy. Various solutions have been proposed 
in both CCD and CMOS technologies to cope with this problem [13-37]. 
Methods for widening the dynamic range can be grouped into five areas: 

1. Companding sensors, such as logarithmic compressed-response
photodetectors; 

2. Multi-mode sensors, where operation modes are changed; 

3. Frequency-based sensors, where the sensor output is converted to 
pulse frequency; 

4. Sensors with external control over integration time, which can be
further divided into global control (where the integration time of the 
whole sensor can be controlled) and local control (where different 
areas within the sensor can have different exposure times); and 

5. Sensors with autonomous control over integration time, in which the 
sensor itself provides the means for different integration times. 

Companding sensors. The simplest solution to increase DR is to 
compress the response curve using a logarithmic photodiode sensor, as 
described in section 4.2.6. Another type of companding sensor for widening 
the DR was introduced by Mead [16], where a parasitic vertical bipolar 
transistor with a logarithmic response was used. The logarithmic function 
there is again a result of the subthreshold operation of the diode-connected 
MOS transistors, added in series to the bipolar transistor. The voltage output 
of this photoreceptor is logarithmic over four or five orders of magnitude of 
incoming light intensity. It has been used successfully by Mahowald and 
Boahen [17]. This detector operates in the subthreshold region and has a low 
output voltage swing. A disadvantage of these pixels is that this form of
compression leads to low contrast and loss of details; adaptation, where
linearity around the operation point is exploited, was proposed to alleviate
this problem [19-20]. In addition, the response of the logarithmic pixels with
this kind of readout is light dependent. This means that for low light
intensities the readout would be very slow, depending also on the
photodiode capacitance.

Figure 4-17. A pulse photosensor with reset circuitry (after Yang).

Multimode sensors. A multisensitivity photodetector was proposed by 
Ward et al. [21]. The detector is a parasitic vertical bipolar transistor 
between diffusion, well and substrate. By connecting a MOS transistor to the 
base and the emitter of the bipolar transistor in a Darlington structure, the 
current gain can be boosted further. Thus, both bipolar transistors can be 
activated at very low light intensities and inactivated at higher intensities. 
For moderate levels, only one bipolar transistor is activated. Two selection 
transistors are required within the pixel for choosing the mode. This pixel 
occupies a relatively large area. 

Frequency-based sensors. In 1994, Yang [22] proposed a pulse 
photosensor that uses simple integrate-and-reset circuitry to directly convert 
optical energy into a pulse frequency output. The output of this photosensor 
can vary over 5-6 orders of magnitude and is linearly proportional to optical 
energy. A diagram of the circuitry is presented in Figure 4-17.

The pixel fill factor is much decreased with this approach, since the inverter 
chain resides next to the photodiode. In addition, the pulse timing relies on 
the threshold voltages of the inverters. Since threshold voltage mismatches 
exist between different transistors, there will be a different response for
each pixel. This makes this sensor worse in terms of noise, since the 
threshold mismatch translates to a multiplicative error (the output frequency 
of the pulses is affected) and not just constant FPN. 

Sensors with external control over integration time. A multiple- 
integration-time photoreceptor has been developed at Carnegie Mellon 
University [23]. It has multiple integration periods, which are chosen
depending upon light intensity to avoid saturation. When the charge level
nears saturation, the integration is stopped at one of these integration periods 
and the integration time is recorded. This sensor has a very low fill factor. 

Sensors with autonomous control over integration time. The automatic 
wide-dynamic-range sensor was proposed by Yadid-Pecht [24, 25]. This 
imager consisted of a two-dimensional array of sensors, with each sensor 
capable of being exposed for a different length of time with autonomous on- 
chip control. Reset enable pulses are generated at specific times during the 
integration period. At each reset enable point, a nondestructive readout is 
performed on the sensor and compared to a threshold value. A conditional 
reset pulse is generated if the sensor value exceeds the threshold voltage. 
The main drawback with this solution is the extent of the additional 
circuitry, which affects spatial resolution. 

In the following sections, we describe an APS with an in-pixel 
autoexposure and a wide-dynamic-range linear output. Only a minimum 
additional area above the basic APS transistors is required within the pixel, 
and the dynamic range enhancement is achieved with minimal effects on the 
temporal resolution. 

4.3.1.2 System architecture

The architecture of the DR approach is shown in Figure 4-18. As in a 
traditional rolling-shutter APS, this imager is constructed of a two- 
dimensional pixel array, here of 64 columns and 64 rows, with random pixel 
access capability. Each pixel contains an optical sensor to receive light, a 
reset input and an electrical output representing the illumination received. 
The pixel used here is not a classic pixel, since it enables individual pixel 
reset via an additional transistor [5]. The outputs of a selected row are read 
through the column-parallel signal chain, and at certain points in time are 
also compared with an appropriate threshold in the comparison circuits. If a 
pixel value exceeds the threshold, a reset is given at that time to that pixel. 
The binary information concerning the reset (i.e., applied or not) is saved in 
digital storage for the later calculation of the scaling factor. The pixel value 
can then be determined as a floating-point number, where the exponent 
comes from the scaling factor for the actual integration time and the 
mantissa from the regular A/D output. Therefore, the actual pixel value 
would be 

$$Value = Man \cdot \frac{T}{T/X^{EXP}} = Man \cdot X^{EXP} \tag{4.3}$$

where Value is the actual pixel value, Man (mantissa) is the analog or
digitized output value that has been read out at the time T, X is a constant
greater than one (for example, 2), and EXP is the exponent value describing
the scaling factor. This digital value is read out at the upper part of the chip.
For each pixel, only the last readouts of a certain number of rows are kept to
enable the correct output for the exponent bits.

Figure 4-18. General architecture description of the autoscaling CMOS APS.
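Equation (4.3) amounts to a floating-point reconstruction of the pixel value. A minimal Python sketch, with illustrative function names and X = 2 as the default base, is:

```python
def autoscaled_value(mantissa, exponent, base=2):
    """Equation (4.3): Value = Man * X**EXP."""
    return mantissa * base ** exponent


def effective_integration_time(frame_time, exponent, base=2):
    """Integration time actually seen by the mantissa readout: T / X**EXP."""
    return frame_time / base ** exponent


# Hypothetical readout: mantissa code 200 from the A/D, exponent 3.
print(autoscaled_value(200, 3))             # value referred to the full frame time
print(effective_integration_time(1.0, 3))   # fraction of the frame time integrated
```

The mantissa is measured over a shortened integration time, and the exponent restores the value the pixel would have reached over the full frame time had it not saturated.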

The idea of having a floating-point presentation per pixel via real-time 
feedback from the pixel has been proposed before by Yadid-Pecht [36]. 
However, the previous design required an area in the pixel that substantially 
affected the fill factor, so it was then proposed that the control for a group of 
pixels should be shared. In the currently proposed solution, however, the 
necessary additional hardware will be placed in the periphery; as a result, the
information can be output with minimal effect on spatial or temporal
resolution. The spatial resolution is slightly modified, since the desired
ability to independently reset each pixel requires an additional transistor per
pixel [37]. The temporal resolution should be assessed: a pixel should be
checked at different time points to get the exponential (scaling) term.
Equivalently, the pixels of different rows could be checked to get the same
information. In the latter case, row n at time zero would provide the mantissa
for row n (either through an on-chip or an off-chip A/D output), while the
pixels in row n - N/2 (where N is the total number of rows that set the frame
time) would provide the first exponent bit ($W_1$) as a result of the logic circuit
decision for that row. Row n - N/4 would provide the second bit ($W_2$) for
that row, row n - N/8 would provide the third bit, and so on. Thus, at the
cost of a customized number of comparisons, the required information can
be obtained automatically and the mantissa scaled accordingly.

Figure 4-19. Combined time-space diagram.

Figure 4-19 describes this approach via a combined time-space diagram
where the axes represent the row number and time, respectively. $W_1$, $W_2$, ...,
$W_W$ represent the exponent bits; i.e., $W_1$ represents the first point of decision
$T - T/2$ (whether to reset or not for the first time), $W_2$ the next point and
so forth. The equivalent is shown at point n - N/2 in the spatial domain,
which is the row that should be used for the decision concerning $W_1$.

For example, an imaging device consisting of a square pixel array of 64
rows and 64 columns may be assumed, and it is desired to expand the
dynamic range by 3 bits due to a high illumination level. Therefore, W = 3
and N = 64 in this case. For each pixel from the selected row n, three
comparisons ($W_1$, $W_2$ and $W_3$) are carried out at three corresponding time
points ($T - T/2$, $T - T/4$, $T - T/8$). The first comparison, which is with the
threshold, is carried out at row n - 32 (32 rows before regular readout of that
pixel). This leaves an integration time of T/2 with a comparison result of
$W_1$ = “1”, and this pixel is reset. The second comparison is carried out at row
n - 16 (16 rows before regular readout of that pixel), leaving an integration
time of T/4 with a comparison result of $W_2$ = “1”; this pixel is also reset. The
third comparison is carried out at row n - 8 (8 rows before regular readout of
that pixel), leaving an integration time of T/8 with a comparison result of
$W_3$ = “1”. This pixel is reset as well. The autoscaling combination for the
pixel in this example is therefore (1 1 1): this means that the pixel has been
reset three times during the frame time, and the regular readout for this pixel
should be scaled (multiplied) by a factor of $2^3 = 8$.
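The worked example can be reproduced with a small behavioral simulation in Python. This is a sketch under stated assumptions, not the chip's logic: the pixel is assumed to integrate linearly, the threshold is assumed to be half the saturation level (so that a pixel exceeding it would saturate by readout), and a reset is assumed to be allowed only if all earlier comparisons also caused a reset. All names are illustrative.

```python
def comparison_row_offsets(n_rows, n_bits):
    """Rows before regular readout at which each comparison is made: N/2, N/4, ..."""
    return [n_rows // 2 ** k for k in range(1, n_bits + 1)]


def autoscale_pixel(rate, n_bits=3, frame_time=1.0, saturation=1.0):
    """Simulate one pixel integrating linearly at `rate` (signal per unit time).

    Returns (exponent bits, mantissa, scaled value) for base X = 2.
    """
    threshold = saturation / 2   # assumed threshold, see the text above
    start = 0.0                  # time of the most recent reset
    may_reset = True             # assumed: resets must form an unbroken sequence
    bits = []
    for k in range(1, n_bits + 1):
        check_time = frame_time - frame_time / 2 ** k   # T - T/2, T - T/4, ...
        value = min(rate * (check_time - start), saturation)
        reset = may_reset and value > threshold
        bits.append(1 if reset else 0)
        if reset:
            start = check_time
        else:
            may_reset = False
    mantissa = min(rate * (frame_time - start), saturation)
    return bits, mantissa, mantissa * 2 ** sum(bits)


print(comparison_row_offsets(64, 3))
for rate in (0.8, 3.0, 6.0):   # signal reached over a full frame, in units of saturation
    print(rate, autoscale_pixel(rate))
```

A pixel that would reach six times the saturation level over the full frame is reset at all three decision points, giving the combination (1 1 1) and a scaling factor of 8, as in the example above.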

The frame time of a pixel array consisting of N rows and N columns may
be calculated. Readout time for each row is composed of a copying time
($T_{copy}$ = the time needed to copy one row into the readout buffer) and a
scanning time ($T_{scan}$ = the time needed to scan each pixel). Since there are N
pixels in each row, the total time for row readout ($T_{row}$) is given by

$$T_{row} = T_{copy} + N \times T_{scan} \tag{4.4}$$

and the frame time ($T_{frame}$) is given by

$$T_{frame} = N \times T_{row} \tag{4.5}$$



By adding W comparisons (for W different integration times) for each row,
the row readout time is slightly modified and is given by

$$T'_{row} = W \times T_{comp} + T_{row} = W \times T_{comp} + T_{copy} + N \times T_{scan} \tag{4.6}$$

where $T_{comp}$ is the time for comparison to the threshold level. Since $W \ll N$,
then

$$W \times T_{comp} \ll N \times T_{scan} \tag{4.7}$$

and $T'_{row} \approx T_{row}$. Hence, the frame time $T_{frame}$ is insignificantly affected by
autoscaling. This enables the imager to process scenes without degradation
in the frame rate.

Figure 4-20. Block diagram of the proposed design.
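The timing relations (4.4) to (4.7) can be checked numerically. In the Python sketch below, the copy, scan and comparison times are hypothetical values in nanoseconds, chosen only to show the order of magnitude of the overhead for N = 64 and W = 3.

```python
def row_readout_time(n_cols, t_copy, t_scan, n_comparisons=0, t_comp=0):
    """Equation (4.4); with n_comparisons > 0 this is T'_row of equation (4.6)."""
    return n_comparisons * t_comp + t_copy + n_cols * t_scan


def frame_time(n_rows, n_cols, t_copy, t_scan, n_comparisons=0, t_comp=0):
    """Equation (4.5): the frame time is the number of rows times the row time."""
    return n_rows * row_readout_time(n_cols, t_copy, t_scan, n_comparisons, t_comp)


N, W = 64, 3
T_COPY_NS, T_SCAN_NS, T_COMP_NS = 1000, 200, 200   # hypothetical timings

plain = frame_time(N, N, T_COPY_NS, T_SCAN_NS)
scaled = frame_time(N, N, T_COPY_NS, T_SCAN_NS, W, T_COMP_NS)
overhead = (scaled - plain) / plain

print(f"frame time without autoscaling: {plain} ns")
print(f"frame time with autoscaling:    {scaled} ns")
print(f"relative overhead:              {overhead:.1%}")
```

With these numbers the three comparisons lengthen the frame time by a few percent, and the overhead shrinks further as N grows relative to W.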

4.3.1.3 Design and implementation

A block diagram of the proposed design is shown in Figure 4-20. The 
design makes use of a column parallel architecture to share the processing 
circuits among the pixels in a column. The pixel array, the memory array 
and the processing elements are separated in this architecture. Each pixel 
contains an additional transistor (in series with the row reset transistor) that 
is activated by a vertical column reset signal; this allows the independent 
reset of the pixel. Integration time can be adjusted for each pixel with this
reset, and nondestructive readout of the pixel can be performed at any time
during the integration period by activating the row select transistor and
reading the voltage on the column bus.

Figure 4-21. Electrical description of one column of an autoscaling CMOS APS.

The processing element contains the saturation detection circuit, which is 
shared by all pixels in a column. Because of the column parallel architecture, 
the pixel array contains a minimum amount of additional circuitry and 
sacrifices little in fill factor. The memory array contains the SRAMs and 
latches. Two horizontal decoders — one each for the pixel array and the 
memory array — work in parallel and are used to retrieve the mantissa and 
exponent, respectively. The vertical decoder is used to select the rows in 
order.

Figure 4-22. Photograph of the fabricated autoscaling CMOS APS test chip.

The electrical scheme for a single column is presented in Figure 4-21. In 
this circuit, the pixel output signal is evaluated at the comparator, where it is 
compared with an appropriate threshold. If its signal exceeds a 
predetermined threshold, the pixel is detected as saturated. Using this 
information and the binary information concerning the pixel (stored in the 
memory during different parts of the integration), a decision whether to reset 
the pixel is made. If the decision is positive, the column reset (CRST) and
row reset (RRST) lines must both be precharged at a logical high voltage to 
activate the reset transistor; the photodiode then restarts integration. If the 
decision is negative, the reset is not active and the pixel continues to 
integrate. The binary information (whether the reset was applied or not) is 
saved in the SRAM memory storage and output to the latches in due time. 
After the row is read through the regular output chain, this additional 
information is retrieved from the memory through the latches. 

4.3.1.4 Experimental results

A 64 x 64 pixel chip was successfully fabricated using the HP 0.5 µm
n-well process. The chip photograph is shown in Figure 4-22. The sensor was
quantitatively tested for relative responsivity, conversion gain, saturation 
level, noise, dynamic range, dark current, and fixed pattern noise. The results 
are presented in Table 4-1. 

The conversion gain was in general agreement with the design estimate
of photodiode capacitance. The saturation level was approximately 1.33 V;
fixed pattern noise (FPN) was measured to be approximately 0.15% of
saturation; dark current was measured to be on the order of 30-35 mV/sec,
output referred, or 0.61 pA/cm²; and the inherent dynamic range was
71.4 dB, or 11 bits. The extended dynamic range provided two additional
bits to the inherent dynamic range. No smear or blooming was observed due
to the lateral overflow drain inherent in the APS design. The chip was also
functionally tested.

Table 4-1. Attributes of the autoscaling CMOS APS test chip.

Chip format                 64 x 64 pixels
Chip technology             HP 0.5 µm
Chip size                   1.878 mm x 2.9073 mm
Pixel size                  14.4 µm x 14.4 µm
Pixel type                  Photodiode
Pixel fill factor           37%
Conversion gain             12 µV/e⁻
Fixed pattern noise (FPN)   0.15%
Dark current (room temp)    35 mV/sec (0.61 pA/cm²)
Power                       3.71 mW (5 MHz)
Inherent dynamic range      71.4 dB (~11 bit)
Extended dynamic range      2 additional bits
Saturation level            1.33 V
Quantum efficiency (QE)     20%

Figure 4-23 shows a comparison between an image captured by a 
traditional CMOS APS and by the autoexposure system presented here. In
Figure 4-23(a), a scene is imaged with a strong light on the object;
hence, some of the pixels are saturated. At the bottom of Figure 4-23(b), the
capability of the autoexposure sensor for imaging the details of the saturated 
area in real time may be observed. Since the display device is limited to 
eight bits, only the most relevant eight-bit part (i.e., the mantissa) of the 
thirteen-bit range of each pixel is displayed here. The exponent value, which 
is different for different areas, is not displayed. This concept in its present 
form suits rolling-shutter sensors, and a first prototype following this
concept has been demonstrated here. 

4.3.2 CMOS smart tracking sensor employing WTA selection 

This section presents an example of a smart APS sensor suitable for 
tracking purposes. The system employs an analog winner-take-all circuit to 
find and track the brightest object in the field of view (FOV). This system- 
on-a-chip employs adaptive spatial filtering of the processed image, with 
elimination of bad pixels and with reduction of false alarm when the object 
is missing. The circuit has a unique adaptive spatial filtering ability that 
allows the removal of the background from the image, and this occurs one 
stage before the image is transferred to the WTA detection circuit. A test
chip with a 64 x 64 array has been implemented in 0.5 µm CMOS technology. It
has a 49% fill factor, operates from a 3.3 V supply, and dissipates 36 mW
at video rate. The system architecture and operation are described, together
with measurements from a prototype chip.

Figure 4-23. (a) Scene observed with a traditional CMOS APS sensor. (b) Scene observed
with the in-pixel autoexposure CMOS APS sensor.

4.3.2.1 Motivation

Many scientific, commercial and consumer applications require spatial 
acquisition and tracking of the brightest object of interest. A winner-take-all
function has an important role in these kinds of systems: it selects and 
identifies the highest input (which corresponds to the brightest pixel of the 
sensor) and inhibits the rest. The result is a high digital value assigned to the 
winner pixel and a low one assigned to the others. CMOS implementations 
of WTA networks are an important class of circuits widely used in neural 
networks and pattern-recognition systems [42, 44]. Many WTA circuit 
implementations have been proposed in the literature [40-52]. A current- 
mode MOS implementation of the WTA function was first introduced by 
Lazzaro [40]. This very compact circuit optimizes power consumption and 
silicon area usage. It is asynchronous, responds in real time and processes all 
input currents in parallel. 

Most of the existing WTA circuits can be integrated with APS sensors. 
Usually, when WTA circuits are used in two-dimensional tracking and 
visual attention systems, the image processing circuitry is included in the 
pixel; however, there are penalties in fill factor or pixel size and resolution. 
Here we show an implementation of a 2-D tracking system using 1-D WTA 
circuits. All image processing is performed in the periphery of the array without
influencing imager quality.

Figure 4-24. Block diagram of the proposed CMOS smart tracking sensor system.

As mentioned above, the regular WTA circuit chooses a winner from a 
group of input signals. When an APS with WTA selection system is used for 
object selection, a number of problems can occur. If the object of interest 
disappears from the processed image, the WTA will compare input voltages 
that represent intensity levels in the image background; hence, the circuit 
will output the coordinates of some background pixel instead of the missing 
object of interest and cause a false alarm. Background filtering and false 
alarm reduction is therefore necessary. Another problem that can disrupt 
proper operation of the system is a bad pixel. Since the bad pixel has a high 
value, it can be selected as the winner regardless of other pixel values. 

The simplest technique for background filtering is signal comparison 
against a globally set threshold, above which pixels qualify as object pixels. 
The disadvantage of this kind of filtering is that it is necessary to choose the 
value of this threshold in advance; in the case where the background is 
overly bright (i.e., above the chosen threshold), the circuit will not be able to 
cope with the task. The problem is most severe if the object of interest 
disappears from the processed image and the background is bright. 

The system described here is a 64 x 64 element APS array with two- 
dimensional WTA selection. A spatial adaptive filtering circuit allows 
adaptive background filtering and false alarm reduction if the object is 
missing from the scene.

4.3.2.2 System architecture

Figure 4-24 shows the block diagram of the system. There is no penalty 
in spatial resolution using the proposed architecture, since the processing is 
done at the periphery of the array. 

The process of phototransduction in the APS is done row-by-row in two 
phases: reset (where a photodiode capacitor is charged to a high reset 
voltage) and phototransduction (where the capacitor is discharged after a 
constant integration time). A commonly used CDS circuit is implemented on 
the chip to subtract the signal pixel value from the reset one; the output is 
high for a bright pixel and low for a dark one. The subsequent stage is 
adaptive background filtering of all pixels for which the CDS values are less 
than a threshold value. This threshold corresponds to the average of the 
outputs from all row sensors, with the addition of a small variable epsilon 
value. Only the signals that pass the filter are transmitted to the WTA 
selection circuit. When there is no detectable object (i.e., only background 
signals exist), no signals pass this filter, and so the “no object” output 
becomes high and false alarm reduction is achieved. A detailed description 
of this filtering technique is presented in section 4.3.2.3.
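The behavior of the adaptive filter can be modeled in a few lines of Python. This is a software sketch of the analog function only, with illustrative names and sample values: the threshold is the row average plus epsilon, values below it are suppressed, and the "no object" flag is raised when nothing passes.

```python
def adaptive_row_filter(cds_row, epsilon):
    """Return (filtered row, no_object flag) for one row of CDS outputs."""
    threshold = sum(cds_row) / len(cds_row) + epsilon
    passes = [value >= threshold for value in cds_row]
    # Suppressed pixels are represented here by zero (a modeling choice).
    filtered = [value if ok else 0.0 for value, ok in zip(cds_row, passes)]
    return filtered, not any(passes)


# A row with one bright pixel on a uniform background (hypothetical values).
print(adaptive_row_filter([0.2, 0.2, 0.2, 1.0], epsilon=0.05))

# A bright but uniform background: nothing exceeds the row average plus epsilon.
print(adaptive_row_filter([0.8, 0.8, 0.8, 0.8], epsilon=0.05))
```

Because the threshold follows the row average, a uniformly bright background is rejected just as a dark one is, which a fixed global threshold cannot do.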

The next stage is the winner-take-all selection, which is done with a 
simple voltage-mode WTA after Donckers et al. [48]. The main factor for 
choosing this WTA circuit is its simplicity; generally any kind of voltage- or 
current-mode WTA can be integrated with this system [53]. The winner 
selection is done row-by-row. The winner of each row is found, its value is 
stored into an analog memory (a capacitor), and its address is deposited in 
the digital storage. If there is more than one input with the same high value,
the WTA chooses the leftmost one using simple logic. The result is a
column of N analog pixel values of row winners with their column 
addresses. From all the row winners, a global winner is selected using an 
identical WTA circuit. The 1-D winner selection array was designed to 
consist of eight identical blocks of eight-input WTA cells to achieve better 
resolution and reduce matching problems. The row winner value is stored in 
the analog memory that corresponds to the actual row and its column address 
is stored in the corresponding digital memory; these analog and digital 
memories are in the ROW logic block displayed in Figure 4-24. In the case 
of “no object” in a row, the value transmitted to the memory is zero. 
Following a full frame scan, the WTA function is activated on all 64 row 
winners (in the ROW WTA block in Figure 4-24) and the location of the 
global frame winner is found. Its analog value and its address are then read 
out of the memory by an encoder (the ENC block in Figure 4-24). 
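
The two-stage selection described above (a winner per row, then a global winner among the row winners) can be sketched in software as follows. This is a behavioral illustration only: the function names and the list-based "memories" are hypothetical, and the tie-breaking of the second stage is assumed to match the leftmost rule of the row stage because the text states that an identical WTA circuit is used.

```python
def row_winner(values):
    """Return (value, index) of the largest input; ties go to the leftmost."""
    best = 0
    for i, v in enumerate(values):
        if v > values[best]:      # strict '>' keeps the leftmost maximum
            best = i
    return values[best], best


def global_winner(frame):
    """Scan the frame row by row, store each row winner, then pick the global one."""
    analog_memory = []    # row-winner values (capacitors on the chip)
    digital_memory = []   # row-winner column addresses
    for row in frame:
        value, col = row_winner(row)
        analog_memory.append(value)
        digital_memory.append(col)
    # The same WTA function is applied to the stored row winners.
    value, row_index = row_winner(analog_memory)
    return value, row_index, digital_memory[row_index]


# Hypothetical filtered frame: zeros are background, non-zero values an object.
frame = [
    [0, 0, 0, 0],
    [0, 2.5, 2.5, 0],
    [0, 2.5, 2.1, 0],
    [0, 0, 0, 0],
]
print(global_winner(frame))  # (value, row, column) of the global winner
```

With equal maxima, this sketch returns the upper-left candidate, which agrees with the measured behavior reported later for the laser spot.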

This architecture allows the proposed system to be enlarged to an N x N
pixel array of any size without affecting the system properties.
Figure 4-25. Principal scheme of the spatial filter circuit in the CMOS smart tracking sensor.

4.3.2.3 Descriptions of the circuits 

4.3.2.3.1 The adaptive spatial filter 

The principal scheme of the adaptive 1-D filter circuit used in the CMOS 
smart tracking sensor system is shown in Figure 4-25. The inputs to the filter 
correspond to the CDS values, and the outputs are the control signals. As 
mentioned earlier, this circuit filters all pixels for which the CDS values are 
less than a threshold value. This threshold corresponds to the average of the 
outputs from all the row sensors, with the addition of a small variable 
epsilon value. The Control output of the filter is high for an object pixel and 
low for a pixel that is imaging the background. The advantage of this 
filtering is that this epsilon value is adaptive and not constant for different 
input vectors. The value of epsilon depends inversely on the average current 
value: it increases when the average current decreases and decreases when 
current increases. This results in a small epsilon value for a high background 
and a high epsilon for a low background. The filtering process is thus more 
sensitive when a high background is present and the input voltage of the 
object is very similar to that of the background. The epsilon value can be 
controlled manually by setting suitable V- and V+ voltage values (see 
Figure 4-25). 

If V+ is zero and V- is V_DD, then the epsilon value is zero and all
Average + ε currents of the circuit are equal to the average current of the n
inputs in an n-sized array. A non-zero epsilon value can be added to this
average by increasing the V+ value. The epsilon value can also be subtracted
from the average by decreasing the V- value. A positive epsilon value is
usually of interest here, so V- is V_DD in this case. Note that the voltages to
the current converters have a pull configuration (the current direction is 
shown by arrows in Figure 4-25). 

The adaptive functionality can be achieved by operation of transistor N in
the linear region. With an average current increase (reflecting an increase in
background), the V_gs values of the P1...Pk transistors are increased as well,
which causes a reduction in the V_ds voltage of transistor N. The result is a fall
in current at transistor N and a reduction of epsilon. Note that if a constant
epsilon value is required, a stable current source (independent of the V_sg of
P1...Pk) can be used instead of transistor N.

Figure 4-26. Principal scheme for bad pixel elimination in the CMOS smart tracking sensor.

Instead of using a transistor N, another simple option is to connect two 
parallel transistors, one with a relatively large W/L value operating in 
saturation and another with a small W/L operating in the linear region. This 
arrangement can achieve a constant epsilon by cutting off the transistor that 
is usually operated in the linear region and using only the saturation 
transistor. Alternatively, an adaptive epsilon value can be achieved by using 
the transistor operating in the linear region and cutting off the transistor that 
is in saturation. 

In addition to background filtering, the circuit enables “no object” 
notification. If the control values of all pixels are “0”, the “no object” output 
is high and false alarms are reduced. However, noise levels vary with the 
background level and shot noise is higher at a higher background level. To 
achieve proper operation of the “no object” function, the minimum value of 
epsilon must therefore be limited. 
Figure 4-27. Photograph of the CMOS smart tracking sensor test chip.

The inputs to the WTA circuit depend on the filter control values: zero 
for a background pixel and pixel intensity level for an object pixel. 

4.3.2.3.2 Elimination of bad pixels 

Bad pixels can disrupt proper operation of the system. A bad pixel can 
have a high value, so it may be selected as the winner regardless of other 
pixel values. In the proposed CMOS smart tracking sensor system, bad 
pixels are disabled with a special “dark mode” in which a dark image is 
input. Figure 4-26 shows the principal scheme for bad pixel elimination. 

Bad pixels are eliminated in two stages. In the first stage (the dark 
mode), the dark_mode signal in Figure 4-26 is “1” and the system processes
a dark image — a very low background without an object of interest. The 
circuit finds bad bright pixels, using a regular WTA algorithm as described 
before. The frame must be scanned (processed) N times in order to find N 
bad pixels. After each additional frame scanning, a new bad pixel is found 
and its address is stored in the memory (using the X_addr and Y_addr buses 
in Figure 4-26). 

In the second stage (regular system operation), the dark_mode signal is
“0” and a real image is processed. For the “bad” pixels that were found in
the dark mode stage, however, only ground values are transmitted to the
filter and the WTA circuits. This is accomplished by comparing the address
of every processed row (in_addr in Figure 4-26) with the Y_addr stored in
the memory. If their values are equal, the comparator will output “1” and the
X_addr of the bad pixel of this row is transferred to the Dec. 6x64 block. In
the array, the input bad pixel voltage is replaced with a ground value. The
values of all bad pixels are replaced by ground, and thus bad pixels cannot
be chosen as winners.

Figure 4-28. Layout of a single photodiode pixel of the CMOS smart tracking sensor.

The fabricated prototype chip allows the elimination of up to three bad 
pixels. Adding more logic can easily increase this capacity. 
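
The two-stage procedure can be summarized with a short software sketch. It is only an illustration of the logic: the function names and example values are hypothetical, and it assumes that addresses already stored during the dark mode are excluded from later dark scans, so that each scan yields a new address.

```python
def find_bad_pixels(dark_frame, count):
    """Dark mode: scan the dark frame `count` times; each scan adds one address."""
    bad = []
    for _ in range(count):
        best = None
        for y, row in enumerate(dark_frame):
            for x, v in enumerate(row):
                if (y, x) in bad:
                    continue              # already stored: treated as ground
                if best is None or v > best[0]:
                    best = (v, y, x)
        if best is None:                  # no pixels left to examine
            break
        bad.append((best[1], best[2]))    # store (Y address, X address)
    return bad


def mask_bad_pixels(frame, bad):
    """Regular operation: replace stored bad-pixel inputs with ground (0)."""
    return [[0.0 if (y, x) in bad else v for x, v in enumerate(row)]
            for y, row in enumerate(frame)]


# Hypothetical dark frame with two stuck-bright pixels.
dark = [[0.01, 0.02, 0.01],
        [0.02, 3.0, 0.01],
        [0.01, 0.01, 2.7]]
bad = find_bad_pixels(dark, 2)
print(bad)
print(mask_bad_pixels([[1.0, 1.0, 1.0], [1.0, 3.1, 1.0], [1.0, 1.0, 2.9]], bad))
```

Note that this procedure stores the brightest pixels of the dark frame whether or not they are truly defective, so the number of dark scans should match the number of bad pixels actually present.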

4.3.2.4 Performance and test chip measurements

The CMOS smart tracking sensor system was designed and fabricated in
a 0.5 µm, n-well, 3-metal CMOS HP technology process supported by
MOSIS. The supply voltage was 3.3 V. A photograph of the test chip is
shown in Figure 4-27. The test chip includes an APS array, row decoders,
a correlated double sampling circuit, an adaptive spatial filter, eight identical
eight-input voltage WTA circuits, logic, a global winner selection circuit and
a bad pixel elimination block.

The test chip was designed to allow separate modular testing of every 
functional block of the chip as well as measurements of the chip as a unit. 
The main separate blocks of the chip (the APS with CDS, the adaptive filter 
and the eight-input WTA circuit) were tested, as was the whole chip to check 
the functionality. 

4.3.2.4.1 APS with CDS

The layout of a single photodiode pixel of the CMOS smart tracking 
sensor is shown in Figure 4-28. 

As mentioned before, there is no penalty in spatial resolution for this
architecture since the processing is done at the periphery. The pixel size is
14.4 µm x 14.4 µm and the fill factor is 49%. Figure 4-29 shows four
images as captured by the sensor under different background illumination
levels.

Figure 4-29. Four images as captured by the APS: (a) the Ben-Gurion University logo on a
high illumination background; (b) the VLSI Systems Center logo; (c) WTA with a laser beam
spot; and (d) a laser beam spot on a dark background.

4.3.2.4.2 Adaptive filter measurements 

Measurements were carried out at both low and high background levels 
to determine the properties of the filter. These measurements check the 
ability of the circuit to filter background regardless of its level and also 
check the dependence of the epsilon value on the background level. Figure 4-30(a)
and (b) show the response of the filter to low and high backgrounds, 
respectively. 

In the first test, the average of the input voltages for a low background
was 820 mV; this is represented by the horizontal solid line in Figure
4-30(a). One of the filter inputs, V_in3, was swept from 0 V to 3.3 V; it
corresponds to the background when its value is low and to the object when
its value is high. It is represented by the sloped sawtooth lines
in Figure 4-30. The pulsed voltage in Figure 4-30 is the filter control output
(see Figure 4-25). This square wave is low for V_in3 < 1.6 V and high for
V_in3 > 1.6 V. The switching point of 1.6 V therefore represents V_average + ε.
In this case, the epsilon value is 780 mV. As mentioned earlier, the epsilon
value can be changed for this input vector by changing the V+ and V- control
voltages.

The same procedure was performed to test the high background case,
where the average was 1.54 V and the epsilon value was found to be 360 mV.
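
As a short worked check of these numbers, epsilon can be taken as the difference between the input voltage at which the control output switches and the average input voltage. The 1.90 V switching point for the high background is not quoted in the measurements; it is inferred here from the stated average and epsilon.

```latex
\varepsilon_{\text{low}} = 1.60\ \text{V} - 0.82\ \text{V} = 0.78\ \text{V},
\qquad
V_{\text{switch,high}} = 1.54\ \text{V} + 0.36\ \text{V} = 1.90\ \text{V}
```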

The filter responded as expected: a high epsilon value was generated for 
a low background level and a low epsilon value for a high background. 

Figure 4-31 plots the epsilon value as function of background levels for 
different V+ values. As expected, an increase in V+ bias at a constant 
background level gives a higher epsilon value. On the other hand, the epsilon 
value decreases with background increase. 
Figure 4-30. Adaptive filter response for (a) a low background and (b) a high background.



4.3.2.4.3 Measurements of the filter response and false alarm reduction 

Figure 4-32 shows the filter response to four different images. The 
column beside each image shows the “no object” outputs for every row: 
white corresponds to “1” in “no object” output and black to “0”. Figure
4-32(a) and Figure 4-32(b) present the same object of interest (a laser beam
spot) but with two different background levels, a high background in Figure
4-32(a) and a low background in Figure 4-32(b).

Because of the adaptive properties of the filter, the epsilon value is higher 
for the low backgrounds in Figure 4-32(b) and (d), so the filtering is more
aggressive and only very high values pass the filter. For the high
backgrounds in Figure 4-32(a) and (c), the epsilon value is lower and
relatively lower voltage values can pass the filter. The filtering process is 
thus more sensitive when a high background is present and the object input 
Figure 4-31. Epsilon value as function of background levels for different V+ values.

Figure 4-32. Adaptive filter response to four different images: (a) laser beam with a high 
background; (b) laser beam with a low background; (c) high background only; and (d) low 
background only. 




Figure 4-33. Global winner selection: (a) the processed image; and (b) the winner location. 








Table 4-2. Chip attributes for the CMOS smart tracking sensor.

Technology                                  HP 0.5 µm
Voltage supply                              3.3 V
Array size                                  64 x 64
Pitch width                                 14.4 µm
Fill factor                                 49%
WTA mode                                    Voltage
WTA resolution                              40 mV
Chip size (mm)                              3.5 x 4.3
Frame scanning frequency                    30 Hz
Minimum power dissipation                   ~28 mW
  (low background without object)
Typical power dissipation                   ~36 mW
  (the laser beam is ~10% of the frame)
FPN (APS)                                   0.15%
Conversion gain (APS)                       7.03 µV/e-
Dark response (APS output)                  5.77 mV/s
Dynamic range                               65 dB
Output voltage range                        1.47 V



voltage is close to the background level. However, there is more freedom in
filtering when a low background is present. In both cases, the object of
interest passes the filter. With a high background (small epsilon value),
however, some pixels of the background succeed in passing the filter, while
for a low background (high epsilon value) only the object pixels pass the
filter. This effect can be seen in Figure 4-32(a) and (b), where the “no object”
output flags show the rows in which some pixels pass the filter. As
expected, Figure 4-32(a) has more “object” rows than Figure 4-32(b).
Figure 4-32(c) and (d) examine the false alarm reduction for high and low
backgrounds, respectively. In both cases, the object of interest is not in the
processed image and thus no signal passes the filter. The “no object” output is
fixed at “1” when the object of interest disappears from the processed image, and
therefore a false alarm is prevented for both levels of background.

4.3.2.4.4 Global winner measurements

To examine the ability of the system to find the coordinates of the global
winner, a focused laser beam was used as the object of interest. Figure 4-33(a)
shows the processed image; Figure 4-33(b) presents the winner as found by
the system. As expected, the winner is the pixel farthest to the upper left in
the first row of the object of interest.

Table 4-2 summarizes the chip specifications for the CMOS smart 
tracking sensor. 
4.4 Summary

In this chapter, the area of CMOS imagers was briefly introduced. 
CMOS technologies, CCD technologies and different CMOS pixels were 
described and compared. The system-on-a-chip approach was presented, 
showing two design examples. The main advantages of CMOS imagers — 
low cost, low power requirements, fabrication in a standard CMOS process, 
low voltage and monolithic integration — rival those of traditional charge 
coupled devices. With the continuous progress of CMOS technologies, 
especially the decreasing minimum feature size, CMOS imagers are 
expected to penetrate into various fields such as machine vision, portable 
devices, security, biomedical and biometric areas, and other applications 
where custom sensors and smart pixels are required. More systems on chips 
will definitely be seen in the near future. 
Abstract: In this chapter, three systems for imaging and visual information processing at
the focal plane are described: current-mode, voltage-mode and mixed-mode
image processing. It is demonstrated how spatiotemporal image processing 
can be implemented in the current and voltage modes. A computation-on- 
readout (COR) scheme is highlighted; this scheme maximizes pixel density 
but still allows multiple processed images to be produced in parallel. COR 
requires little additional area and access time compared to a simple imager, 
and the ratio of imager to processor area increases drastically with scaling to 
smaller-feature-size CMOS technologies. In some cases, it is necessary to 
perform computations in a pixel-parallel manner while still retaining the 
imaging density and low-noise properties of an APS imager. Hence, an imager 
that uses both current-mode and voltage-mode imaging and processing is 
presented. The mixed-mode approach has some limitations, however, and 
these are described in detail. Three case studies are used to show the relative 
merits of the different approaches for focal-plane analog image processing. 

Key words: Focal-plane analog image processing, current-mode image processing,
voltage-mode image processing, spatiotemporal image filtering, time
difference imaging, motion detection, centroid localization, CMOS imagers, 
analog VLSI. 

5.1 Introduction 

Integrating CMOS active pixel sensors (APS) [1-2] with carefully 
chosen signal processing units has become a trend in the design of camera- 
on-a-chip systems [3-4]. The benefits of low cost and low power from the 
emerging CMOS imaging technology have encouraged various research 
directions in creating image processing sensors. In the early 1990s, Mead 
initiated the concept of including processing capabilities at the focal plane of 
imagers with his neuromorphic design paradigm [5]. Mead’s neuromorphic
approach has inspired many other engineers to mimic the receptive fields 
(i.e., the spatiotemporal impulse responses) of human retinal cells. These 
cells perform convolution of the center surround kernel with the incident 
image [1-9]. Such retinomorphic systems are ideal for fast early processing. 
The inclusion of processing circuitry within the pixels, however, prevents 
the retinomorphic systems from acquiring high-resolution images. These 
systems are also optimized to realize specialized spatiotemporal processing 
kernels by hardwiring the kernels within the pixels, thus limiting their use in 
many algorithms. Boahen and Andreou introduced increased flexibility by 
using external biases to control the size of the hardwired convolution kernels 
[8]. Shi recently presented a specialized imager for computing various Gabor 
filtered images [10], in which the convolved images were used to detect 
object orientation. Although focal plane processing with programmable
cellular networks [1-14] and switched-capacitor networks [15] provides another
direction for solving the programmability problem, it has resolution
limitations similar to those of the retinomorphic approach.

Digital focal-plane implementation is another approach to fast and high- 
precision image processing. This approach requires analog-to-digital 
conversion (ADC) and one or many digital processing units (typically a 
CPU/DSP core). In these architectures, multi-chip systems or complicated 
digital system-on-chip units are needed. Typically the imaging and ADC are
performed on one chip, whereas the computation is performed in the digital 
domain on a second chip [16-17]. High power consumption, complex inter- 
chip interconnections and poor scalability are the usual limitations. A single 
chip solution has been discussed where the imaging, ADC and digital 
processing are included at the focal plane [18]. However, only a very small 
percentage of that chip is used for imaging. 

Integrating simultaneous spatial and temporal processing on the same 
substrate allows more versatility for computer vision applications. The 
computational time for computer vision algorithms can be drastically 
improved when spatial and temporal processing are done at the focal plane. 
This information is then presented to the CPU together with the intensity 
image. Hence, the first layer of computation can be performed at the focal 
plane, freeing computational resources for other higher-level tasks. 

In this chapter, three systems that perform imaging and visual 
information processing at the focal plane are described: current-mode, 
voltage-mode and mixed-mode image processing. It is demonstrated how 
spatiotemporal image processing can be implemented in the current and 
voltage modes. A computation-on-readout (COR) scheme is highlighted; this 
scheme maximizes pixel density but still allows multiple processed images 
to be produced. It requires little additional area and access time compared to
a simple imager. In some cases, it is necessary to perform computation in a 
pixel-parallel manner while still retaining the imaging density and low-noise 
property of an APS imager. Hence, an imager that uses both current-mode 
and voltage-mode imaging and processing is presented. The mixed-mode 
approach has some limitations, however, and these are described in detail. 
Three case studies are used to show the relative merits of the different 
approaches for focal-plane analog image processing. 

5.2 Current-domain image processing: the general image processor

5.2.1 System overview 

The three main components of the general image processor (GIP) are (1) 
a photopixel array of 16 rows by 16 columns, (2) three vertical and three 
horizontal scanning registers, and (3) a single processing unit. The three 
vertical and three horizontal scanning registers select several groups of 
single or multiple pixels within a given neighborhood of the photo array. 
The non-linearly amplified photocurrent values of the selected pixels are 
then passed to the processing unit, where convolutions with the desired filter 
are computed. The processing unit, which consists of four identical but 
independent subprocessors, is implemented with digitally controlled analog 
multipliers and adders. The multipliers and adders scale each of the pixel 
photocurrents according to the convolution kernel being implemented (see 
Figure 5-1). The final output of the processing unit is a sum of scaled 
photocurrents from the selected neighborhood. The independent groups of 
pixels can be combined in various ways, allowing for the realization of 
various complicated separable and non-separable filters. Each of the four 
subprocessors can be independently programmed in parallel, allowing for 
different spatiotemporal convolutions to be performed on the incident image 
in parallel. 

5.2.2 Hardware implementation 

The pixel is composed of a photodiode, a non-linear photocurrent
amplification circuit, a sample-and-hold image delay element, and pixel
selection switches (see Figure 5-2). The photodiode is implemented as the
source diffusion extension of an NFET load transistor, M1. The photocurrent
and the non-linear resistance of M1 produce a voltage at the source of M1.
Transistors M2, M3, M4, and M11 (after sample-and-hold) behave as
transconductors and transform the source voltage of M1 back to current with
a non-linear gain. This circuit magnifies the photocurrent by up to three
orders of magnitude.

Figure 5-1. Overview of the general image processor (GIP) architecture.

The load transistor M1 can be operated in an integrative or
non-integrative/fixed-bias mode. In the integrative mode, a pulse train is applied
to the gate of the M1 transistor, alternating between the integrative and reset
interval of the transistor. During the reset interval, the gate of the M1
transistor is pulled to V_dd or higher, charging the photodiode and source node
of this transistor to > V_dd - V_Tn(sb).

Figure 5-2. Organization of the GIP photopixel. The photocurrent is amplified and mirrored
four times. A sample-and-hold circuit is included in the pixel to provide the delayed image.

With the four output transistors and the photodiode, a capacitance of
~65 fF is expected at the source of M1. When coupled with the g_m of M1
(~2 x 10^-7 mhos with a 1 nA photocurrent), this provides a reset time
constant of ~0.3 µs. Clearly the reset time would be problematic if
difference double sampling (DDS) or correlated double sampling (CDS) were
used for V_Tn(sb) variation cancellation while scanning the array above 3 MHz.
No DDS was attempted here.
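
The quoted reset time constant can be checked with a one-line estimate, assuming the source node behaves as a single-pole RC network whose resistance is 1/g_m of M1 (a simplification adopted here for illustration):

```latex
\tau_{\text{reset}} \approx \frac{C}{g_m}
  = \frac{65\ \text{fF}}{2 \times 10^{-7}\ \text{S}}
  \approx 0.33\ \mu\text{s},
\qquad
\frac{1}{\tau_{\text{reset}}} \approx 3\ \text{MHz}
```

The reciprocal of the time constant is about 3 MHz, which is consistent with the scanning rate above which the reset time is said to become a problem.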

During the integration period, the gate voltage of M1 is pulled down to
V_ss, which turns it off. The photocurrent from the diode discharges the
floating source node of M1. In the non-integrative mode, a constant bias
voltage is applied to the gate of M1, so that M1 becomes a non-linear
resistive load. The currents produced by M2, M3, M4, and M11 are scanned
and passed to the processing unit for further processing.

The transistors M2, M3, M4, and M11 provide non-linearly amplified
photocurrents. Three of the amplified photocurrents (I_x, I_y, and I_org) are used
for spatial processing. The fourth is used for temporal processing, and is
passed through a sample-and-hold circuit that stores the amplified
photocurrent for many frame times. In the integrative mode, the sampling
switch is opened at the end of the integration cycle, thus holding the charge
on the gate of M11. In the fixed-bias mode, the sampling switch is pulsed
when a new stored image is required. In the former case, two distinct
integration cycles are required for each processing pass, whereas in the latter
case, processing can be done in each scanning pass. The sampling switch
established by M12 can also act as an electronic shutter that controls the
exposure time of the pixel; this is in addition to controlling the timing when
the source voltage of M1 is sampled to the gate capacitance of M11. The
gate capacitance of M11 acts as a holding node, and has been measured to
maintain the output current for seconds in low light. The leakage currents of
the drain/source diodes of M12 limit the time for storing the photocurrent,
but this leakage can be reduced using shielding. (The process data indicates
a diode junction leakage of ~2 fA/µm^2 under unknown reverse-bias voltage.
The capacitance of M11 is ~27 fF and the leakage diode area is 5.8 µm^2;
hence, the expected leakage rate is ~2.3 s/V. A longer time is observed in
reality.) Some temporal image smear is expected due to charge redistribution
at M11 during sampling. This is acceptable and perhaps beneficial for the
delayed image, however, since it low-pass filters the stored image.
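
The leakage figure in parentheses above can be reproduced from the quoted process data, assuming the junction leakage scales linearly with diode area and discharges only the holding capacitance:

```latex
I_{\text{leak}} \approx 2\ \text{fA}/\mu\text{m}^2 \times 5.8\ \mu\text{m}^2 \approx 11.6\ \text{fA},
\qquad
\frac{dV}{dt} \approx \frac{I_{\text{leak}}}{C} = \frac{11.6\ \text{fA}}{27\ \text{fF}} \approx 0.43\ \text{V/s}
\;\Rightarrow\;
\frac{dt}{dV} \approx 2.3\ \text{s/V}
```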

The six scanning registers are used to select groups of pixels and direct 
their photocurrents to the eight global current buses. The selection of the 
groups of pixels is accomplished in two phases. These two phases are
applied to both vertically and horizontally (in parentheses below) directed 
currents. In the first phase, the top (left) scanning register selects all the 
pixels in the given columns (rows) of interest (see Figure 5-1). The 
photocurrent values of these pixels are then summed horizontally 
(vertically), providing the summed photocurrent values on each of the 16 
rows (columns). In the second phase, the right (bottom) scanning registers 
select three of the previously activated 16 rows (columns) and direct each 
one of them to a separate vertical (horizontal) bus. This phase is achieved 
through a single analog multiplexer per row (column), where the control bits 
of the multiplexer are specified by the two registers on the right (bottom). 
Since there is a total of three global vertical and three global horizontal 
buses on the right and bottom of the photo array, respectively, a total of six 
different groups of pixels are selected and passed to the processing unit for 
further processing. 

The bottom two registers and the two right registers are used to select 
one additional group of pixels. The selection of this group of pixels is
achieved in two steps. In the first step, bit slices of the bottom (right) two
registers are NANDed. Subsequently, the NAND results activate PFET
switches for the I_del (delayed image) and I_org (original image) currents for the
pixels of interest (see Figures 5-1 and 5-2). These two currents are passed to
the processing unit for spatiotemporal processing.

Figure 5-3. Schematics of the virtual ground circuit and scaling circuits.

The virtual ground (VG) circuit is designed using the four-transistor
current conveyor circuit shown in Figure 5-3. Virtual ground circuits are
used to mask the large capacitance of the current buses. The impedance at
the input of the VG circuit can be easily derived to be ~2/g_m(PFETs), assuming
the NFETs and PFETs have the same dimensions. Assuming a single pixel is
selected per current output line, a quiescent pixel (dark) current of ~1 µA
flows into the VG circuit (see Figure 5-9) and the W/L ratio of all the
transistors is 10, giving an input impedance of ~200 kΩ for this CMOS
process. In the light or with larger kernels, more pixels are selected so that
the quiescent current is larger and leads to a smaller VG input impedance.
The line capacitance, primarily determined by the drain diffusion
capacitance of the readout switches, is ~2 fF/pixel in a row or column. With
16 pixels on each row, the VG response time is 6.4 ns; with 1000 pixels per
row, it is 400 ns. Hence, it is easy to obtain the required VG response time
constant by increasing the quiescent current.
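
The response times quoted above follow from a simple RC product of the VG input impedance and the total line capacitance. The sketch below reproduces them; the function name and its default arguments are illustrative assumptions, with the defaults taken from the ~2 fF/pixel and ~200 kΩ figures in the text.

```python
def vg_time_constant(pixels_per_line, c_per_pixel=2e-15, r_in=200e3):
    """RC time constant (seconds) of a current bus loading the VG input.

    c_per_pixel: line capacitance contributed by each pixel (farads).
    r_in: input impedance of the virtual ground circuit (ohms).
    """
    return r_in * pixels_per_line * c_per_pixel


print(vg_time_constant(16))     # about 6.4e-09 s
print(vg_time_constant(1000))   # about 4e-07 s
```

Because the input impedance is roughly 2/g_m, a larger quiescent current lowers `r_in` and shortens the response time, which is the design lever mentioned in the text.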

Nine virtual ground circuits are used in this architecture. Eight of these 
are used for the eight global current buses that bring the pixel currents to the 
processing unit. The last virtual ground circuit connects together all the 
current buses (both horizontal and vertical) that are not selected by the 
scanning registers and then connects them to a reference voltage. This virtual 
ground circuit keeps the voltage of all the bus currents at a fixed value, even 
when they are not selected. Hence, the pixel current buses are precharged at 
all times. During readout, the pixel currents are simply connected to the 
virtual ground circuit or to the reference voltage. 

The processing unit is a digitally controlled analog processing unit 
consisting of four subunits. The subunits are identical in structure, each with 
a digital control memory of 40 bits as well as analog scale and add circuits. 
Each of the eight input currents is mirrored four times and then passed to
the subprocessors for individual computation. The digital memory assigns a
5-bit signed-magnitude control word per current, which specifies the kernel
coefficient for each current (see Figure 5-5). The coefficient can vary within
a range of ±3.75 (in increments of 0.25). The appropriate weight factors vary
depending on the given mask of interest. After each current is weighted by 
the appropriate factor, all currents are summed together to produce the 
desired processed image. 
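
The mapping from a 5-bit signed-magnitude control word to a kernel coefficient can be sketched as follows. The bit layout (sign in the most significant bit, four magnitude bits in steps of 0.25) is an assumption made for illustration; the text specifies only the word length, the signed-magnitude format, the ±3.75 range and the 0.25 increment. The function names are likewise hypothetical.

```python
def decode_coefficient(word):
    """Map a 5-bit signed-magnitude word to a kernel coefficient.

    Assumed layout: bit 4 is the sign (1 = negative), bits 3..0 hold the
    magnitude in steps of 0.25, giving a range of -3.75 to +3.75.
    """
    if not 0 <= word < 32:
        raise ValueError("expected a 5-bit word")
    magnitude = (word & 0b1111) * 0.25
    return -magnitude if word & 0b10000 else magnitude


def weighted_sum(currents, words):
    """Scale each bus current by its decoded coefficient and add the results."""
    return sum(decode_coefficient(w) * i for i, w in zip(currents, words))


print(decode_coefficient(0b01111))   # largest positive coefficient
print(decode_coefficient(0b11111))   # largest negative coefficient
print(weighted_sum([1.0, 2.0], [0b00100, 0b10010]))
```

Eight such words, one per input current, account for the 40 bits of control memory in each subprocessor.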

5.2.3 Algorithm 

To capitalize on the power of the parallel processing capabilities of the 
GIP, the sizes of the pixel groups are kept small. Minimizing the number of 
pixels per group maximizes the number of independent kernels that can be 
implemented in parallel. Ideally, if every pixel value for a given 
neighborhood is available to the processing unit, the kernels can be 
completely general (i.e., every pixel can be given its own coefficient). This 
is not possible in this architecture without using a large number of pixel 
current copies. Since pixel size and spacing would be excessive due to a 
large number of current routing lines, a completely general implementation 
would be impractical. The trade-off between generality and pixel size must
be taken into account in designing the GIP. Hence, our design allows for
computation of variable sizes of kernels based on a 3 x 3 canonical model, 
where eight unique coefficients can be applied to the nine pixels. The 
distribution of these coefficients depends on the configuration of the 
switches (or registers) for selection and routing. 

To illustrate this point, a 3 x 3 block can be considered in which the 
scanning registers are loaded with the bit patterns shown in Figure 5-1. After 
this is completed, the global current buses will carry the groups of pixel 
currents that are described in equations (5.1a) through (5.1h): 



$$I_{x1} = I(1,1) + I(1,3) \tag{5.1a}$$

$$I_{x2} = I(2,1) + I(2,3) \tag{5.1b}$$

$$I_{x3} = I(3,1) + I(3,3) \tag{5.1c}$$

$$I_{y1} = I(1,1) + I(3,1) \tag{5.1d}$$

$$I_{y2} = I(1,2) + I(3,2) \tag{5.1e}$$

$$I_{y3} = I(1,3) + I(3,3) \tag{5.1f}$$

$$I_{org} = I(2,2) \tag{5.1g}$$

$$I_{del} = I(2,2,t-1) \tag{5.1h}$$

Table 5-1. Summary of convolution kernel construction. 

| Kernel | Output current |
|---|---|
| x edges | $-I_{x1} - I_{y2} - I_{x3} + 2I_{x2} + 2I_{org}$ |
| y edges | $-I_{y1} - I_{x2} - I_{y3} + 2I_{y2} + 2I_{org}$ |
| 45° edges | $I_{x1} - I_{y3}$ |
| 135° edges | $I_{x1} - I_{y1}$ |
| Gaussian | $2I_{org} + I_{x2} + I_{y2}$ |
| Laplacian | $-I_{y2} - I_{x2} + 4I_{org}$ |
| Rectangular smooth | $(I_{x1} + I_{x2} + I_{x3} + I_{y2} + I_{org})/9$ |
| Temporal derivative | $I_{org} - kI_{del}$ |


Other combinations are possible, but the grouping presented will yield 
the maximum number of independent kernels processed in parallel. For large 
kernels, the current $I(i,j)$ can be the sum of multiple individual pixel 
currents, such as those of a 3 × 3 subregion. Using these currents as the 
basis, various kernels can be constructed. Table 5-1 gives some examples of 
convolution kernels realized with the pixel grouping presented in equations 
(5.1). 
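As a rough check on how the grouped currents reproduce ordinary 3 × 3 kernels, the sketch below forms the grouped currents of equations (5.1) from a 3 × 3 patch of pixel values and compares three combinations with a direct weighted sum. The function names and the sample patch are illustrative only; the combinations are the Laplacian, Gaussian, and 45° edge expressions as read from Table 5-1.

```python
def group_currents(patch, delayed_center=0.0):
    """Form the grouped currents of equations (5.1) from a 3x3 patch.

    patch[r][c] holds I(r+1, c+1); delayed_center stands in for I(2,2,t-1).
    """
    def pixel(row, col):
        return patch[row - 1][col - 1]

    return {
        "x1": pixel(1, 1) + pixel(1, 3),
        "x2": pixel(2, 1) + pixel(2, 3),
        "x3": pixel(3, 1) + pixel(3, 3),
        "y1": pixel(1, 1) + pixel(3, 1),
        "y2": pixel(1, 2) + pixel(3, 2),
        "y3": pixel(1, 3) + pixel(3, 3),
        "org": pixel(2, 2),
        "del": delayed_center,
    }


def laplacian(g):
    return -g["y2"] - g["x2"] + 4 * g["org"]


def gaussian(g):
    return 2 * g["org"] + g["x2"] + g["y2"]


def edge_45(g):
    return g["x1"] - g["y3"]


def direct_kernel(patch, kernel):
    """Reference result: element-wise weighted sum of the patch."""
    return sum(patch[r][c] * kernel[r][c] for r in range(3) for c in range(3))


patch = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
currents = group_currents(patch)
```

For the sample patch, `laplacian(currents)` matches the direct sum with the kernel `[[0, -1, 0], [-1, 4, -1], [0, -1, 0]]`, which shows that each expression is only a regrouping of a conventional mask.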

A general convolution with an M × N kernel is given in equation (5.2). 
Four such convolutions (with different coefficients) can be executed in 
parallel. 

$$I_{out} = \sum_{i=1}^{3} \left( a_i I_{xi} + b_i I_{yi} \right) + c\,I_{org} + d\,I_{del} \tag{5.2a}$$

$$\{a_i, b_i, c, d\} = \frac{n}{4} \quad \text{for } -15 \le n \le +15 \tag{5.2b}$$

$$I_{\{x,y\},i} = \sum_{k,l}^{M,N} e_{kl}\, I_{ph}(k,l), \quad \text{where } e_{kl} \in \{0, 1\} \tag{5.2c}$$
Figure 5-4. Spatial convolution results from the GIP. (a) Horizontal edge 
detection obtained from a 3 × 3 mask. (b) Vertical edge detection obtained 
from a 3 × 3 mask. (c) 2D edges obtained from a 3 × 3 Laplacian mask. 
(d) Smoothed image obtained from a 3 × 3 Gaussian mask. (e) Horizontal edge 
detection obtained from a 5 × 5 mask. (f) Vertical edge detection obtained 
from a 5 × 5 mask. (g) 2D edges obtained from a 5 × 5 Laplacian mask. 
(h) Smoothed image obtained from a 5 × 5 Gaussian mask. 

5.2.4 Results 



The GIP was tested using a programmed bit pattern discussed later in this 
chapter. Figure 5-4 shows examples of the outputs of the chip when the 
incident image is convolved with the convolution kernels in Figure 5-5. 
Images (a) to (d) are computed using 3 × 3 convolution kernels, whereas 
images (e) to (h) are computed using 5 × 5 convolution kernels. A 5 × 5 
kernel is identical to a 3 × 3 kernel except that a 2 × 2 subregion replaces the 
pixels that are off the central axis of the kernel (see Figure 5-5). Similar 
techniques are used to construct larger kernels. The top row of images in 
Figure 5-4 shows the vertical and horizontal edge detection images 
respectively, computed by the two kernel sizes. The bottom row of images 
shows the Laplacian and Gaussian images respectively. As expected, the 
vertical black line in images (a) and (f) is not visible in the horizontal edge 
images, (b) and (g). Both horizontal and vertical edges are visible in the 
Laplacian image, whereas the Gaussian image provides a smooth version of 
the image. 

Figure 5-5. Convolution kernel coefficients used for spatial processing in Figure 5-4. 

Figure 5-6. Some complex and non-separable convolution kernels. (a) Diagonal edge 
detection. (b) Horizontal and vertical edge detection. (c) High-pass filtered image. (d) -45° 
edge detection. (e) +45° edge detection. (f) Low-pass filtered image. 

Figure 5-6 shows further examples of images filtered by various complex, 
non-separable, or rotated filters. The kernels are shown in Figure 5-7. Image 
(a) is the result of convolution with a spatial mask that only computes the 
diagonal edges of the image, whereas the kernel for image (b) highlights a 
combination of horizontal and vertical edges while suppressing the diagonal 
edges. Images (d) and (e) show -45° and +45° edge detection respectively, 
and image (f) shows a low-pass filtered version of the image. Furthermore, 
the kernels for images (a) and (c) compute second-order derivatives 
(oriented Laplacian operators) of the intensity, whereas images (b), (d), and 
(e) are first-order derivatives (oriented gradient operators). Hence, the sign 
of the slope of the intensity gradient is observable in the latter images, but 
not in the former. The orientation selectivity of the 2D edge detectors is 
clearly visible in these figures. Other complex and non-separable filters can 
also be implemented with the GIP. 

Figure 5-7. Non-separable convolution kernel coefficients used for spatial processing 
in Figure 5-6. 

Table 5-2. Summary of GIP characteristics. 

| Characteristic | Value |
|---|---|
| Technology | 1.2 µm N-well CMOS |
| No. of transistors | 6000 |
| Array size | 16 × 16 |
| Pixel size | 30 µm × 30 µm |
| Fixed-pattern noise (STD/mean) | 2.5% (average) |
| Fill factor | 20% |
| Dynamic range | 1-6000 lux |
| Frame rate range | DCM00 kHz |
| Kernel size range | From 2 × 2 up to whole array |
| Kernel coefficients | ±3.75 by 0.25 |
| Coefficient precision (STD/mean) | Intraprocessor: <0.5%; interprocessor: <2.5% |
| Temporal delay (holding time) | 1% decay in 150 ms @ 800 lux; 1% decay in 11 ms @ 8000 lux |
| Power consumption ($V_{dd}$ = 5 V @ 800 lux) | 5 × 5 array: 1 mW @ 20 kfps |
| Computation rate (add and multiply) | 5 × 5 array: 1 GOPS/mW @ 20 kfps |

Table 5-2 shows the general characteristics of the chip. A much larger 
array is possible with no impact on the performance of the scanning circuit 
and processor unit. In a larger array, most of the additional area will be in 
the photo array because the overhead for the scanning and processing 
circuits will be similar to that required in this chip. The circuitry of the 
processing unit is independent of the photo array size and will be the same as 
in the smaller photo array, although it will be redesigned to handle large 
convolution kernels. The current handling capability must be increased. 
Similar digital processors typically use more than 50% of the area budget in 
the processing units [19]. 

The amplified photoresponses of many randomly distributed pixels in the 
fixed-bias mode ($V_{bias}$ = 5 V) are shown in Figure 5-8(a). The light 
intensity is varied over six orders of magnitude. For very low light intensity, 
both the M1 and M2 transistors in Figure 5-2 operate in weak inversion, 
resulting in a linear relation between the light intensity and the amplified 
photocurrent. This dependence is described by equation (5.3), but is not 
observed in Figure 5-8. This is because dark current in the photodiode 
produced a gate-to-source voltage drop in M1 (~0.96 V) that is large enough, 
due to the body effect ({$V_{T0n}$, $\gamma_n$} = {0.6 V, 0.62 V$^{1/2}$} and 
{$V_{T0p}$, $\gamma_p$} = {0.7 V, 0.4 V$^{1/2}$}), to operate M2 above threshold. 
Models for transistors in below- and above-threshold states can be found in 
[20]. A minimum (dark) current of 1.16 µA flows in M2; this current can be 
reduced or eliminated by raising the gate voltage of M1 above $V_{dd}$. 

Figure 5-8. Photoresponse of the GIP pixel operating in fixed-bias mode. (a) Low-to- 
medium light response of 50 randomly distributed pixels with $V_{bias}$ = 5 V. (b) Low-to-medium 
light response of a pixel for various $V_{bias}$ values. (c) Medium-to-bright light response of the 
pixel. (d) Output dark current for various $V_{bias}$ values. 

Equation (5.4) shows the dependence of the output current on 
photocurrent when M1 is in weak inversion and M2 is in strong inversion 
and saturated. This equation indicates that the output current has a log 
dependence for small photocurrents and a log-squared dependence for large 
photocurrents. The log dependence for small photocurrents is observed in 
Figure 5-8(a). The slope of the curve is determined by the bias gate voltage 
of M1 and the difference in threshold voltages of M1 and M2. It should be 
noted that the difference in threshold voltages decreases with increasing 
light because $V_s$ of M1 approaches its $V_b$ (i.e., $V_{ss}$). This is also 
observable in Figure 5-8 as a decrease in slope with increasing light intensity. 
In Figure 5-8(b), a range of gate bias voltages for M1 is shown. As predicted 
by equation (5.4), the slope of the curve increases as $V_{dd} - V_{bias}$ 
increases. Figure 5-8(c) shows the transition from log to log-squared 
dependence as the light intensity is increased drastically. The imager will be 
used in the logarithmic mode to widen the dynamic range of the pixel. It is 
also true that biological photoreceptors exhibit logarithmic responses [5]; the 
logarithmic relationship has profound influences not only on the dynamic 
range of the pixel, but also on grayscale and color scene analysis. 
Figure 5-8(d) shows the pixel output current in the dark as $V_{bias}$ is varied. 
The expected quadratic dependence is observed. 

As the photocurrent increases further, both the M1 and M2 transistors 
operate in strong inversion, resulting in a linear relation between the 
light intensity and the amplified photocurrent. This dependency is given by 
equation (5.5) and was not reached in our experiments. 

where $V_t$ is the thermal voltage, $I_0$ and $k_{\{p,n\}}$ are process parameters, 
$i_{photo}$ is the photocurrent passing through transistor M1, and $i_{out}$ is the 
amplified photocurrent passing through transistor M2. 

© V. Gruev and R. Etienne-Cummings, “Implementation of steerable spatiotemporal image filters on the 
focal plane,” IEEE Trans. Circuits and Systems-II, vol. 49, no. 4, pp. 233-244, Apr. 2002 (partial reprint). 

Figure 5-9. Fixed-pattern noise of the GIP pixel operating in integrative and non-integrative 
modes. 

The fixed-pattern noise (FPN) characteristics for integrative and non- 
integrative modes are shown in Figures 5-8 and 5-9. In Figure 5-8(a), the 
response of many pixels under fixed-bias operation is shown, and the 
variation at each light intensity is visible. As expected, the variations appear 
as offsets to the pixel currents. The magnitude of the variation remains 
relatively constant with light intensity; hence, the % FPN (STD/mean) will 
decrease with light intensity, as seen in Figure 5-9. 

The noise characteristics for low light intensity are better when the pixels 
operate in the integrative mode. In this mode, transistor M1 is either turned 
off during the integration period or is operated in strong inversion 
during the reset period. To reduce the variations in the voltage at the start of 
integration (i.e., the reset voltage), the pixel is reset by driving the gate of 
M1 above $V_{dd}$. With a sufficiently high gate voltage [i.e., greater than 
$V_{dd} + V_{Tn}(sb)$], the reset voltage will be $V_{dd}$ for all pixels. In this 
case, the fixed-pattern noise depends primarily on the M2 transistor and is not 
affected by transistor M1. When M1 operates in fixed-bias mode, both M1 
and M2 operate in weak inversion in low light intensities. Hence, the 
variations in M1 and M2 (which are uncorrelated because the transistors are 
different types) will contribute to the noise characteristics. In low light, FPN 
for the integrative case is found to be approximately half that of the fixed- 
bias case (1.2% vs. 2.3%). It is well known that currents of the transistors 
match better (in terms of $\sigma_I/I_{ds}$) in strong inversion than in weak 
inversion because the mean current is large in the former case [21]; M1 and M2 
operate in strong inversion in bright light under both integrative and fixed- 
bias operation. For the integrative case, the output current is larger than in 
the fixed-bias case because the gate voltage of M2 can be large. 
Consequently, the STD/mean for the integrative mode quickly (first 
exponentially, then quadratically) reaches its minimum of less than 1%. In 
typical indoor lighting, the FPN will limit the image signal-to-noise ratio 
(SNR) to ~6 bits and ~7 bits for the fixed-bias and integrative modes, 
respectively. Correlated or difference double sampling can be used to 
improve the FPN [22]. 

The scaling precision of the processing unit is shown in Figure 5-10. In 
Figure 5-10(a), the imager is stopped on a pixel and a processing unit is 
programmed with all 32 coefficients under nine ambient light levels. The 
plot shows that the weights vary about the ideal x = y line, but the variation 
is quite small. Figure 5-10(b) shows that the variation in terms of STD/mean 
is less than 0.5% (~8 bits) across coefficients. Hence, within one processing 
unit, the precision of computation is approximately 8 bits. Figure 5-10(c) 
compares the matching across the four processing units for one light 
intensity. Figure 5-10(d) shows that the precision across the processing units 
is poorest for the smallest coefficients. This is again consistent with the 
expectation that for smaller currents, STD/mean becomes larger if the STD 
remains relatively unchanged. The matching across processing units is 
approximately 2.5% (~5 bits). Better layout practices can be used to 
improve the matching of processing units. 
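The bit figures quoted above can be related to the STD/mean values with a one-line conversion. The sketch below assumes that "bits" means the base-2 logarithm of mean/STD, which is a common convention but is not stated explicitly in the text; the function name is hypothetical.

```python
import math


def equivalent_bits(std_over_mean):
    """Number of bits resolvable when the relative spread is std_over_mean.

    Assumes bits = log2(mean / STD); the text does not define the conversion.
    """
    if std_over_mean <= 0:
        raise ValueError("STD/mean must be positive")
    return math.log2(1.0 / std_over_mean)


intraprocessor_bits = equivalent_bits(0.005)  # 0.5% within one processing unit
interprocessor_bits = equivalent_bits(0.025)  # 2.5% across processing units
```

Under this assumption, 0.5% corresponds to about 7.6 bits and 2.5% to about 5.3 bits, in line with the rounded values of 8 and 5 bits given in the text.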

The total power consumption for computing four convolved images with 
5 × 5 spatiotemporal kernels is 1 mW with 5 V supplies and 10 µW/cm² 
ambient light. The power consumption can be easily decreased by modifying 
the gain of the pixels and by decreasing the power supply voltage. 
Convolving the incident image with four 5 × 5 kernels (25 five-bit 
multiplications and 25 five-bit additions per kernel) over the entire frame 
(16 × 16 pixels) at 20 kfps is equivalent to executing 1 GOPS/mW. The 
power consumption per operation is several orders of magnitude lower 
than that of a comparable system consisting of a DSP processor integrated 
with a CCD camera or a single-chip CNN-UM [12, 16-17]. 
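The 1 GOPS/mW figure follows directly from the numbers in the paragraph above. The short calculation below reproduces it, counting each multiplication and each addition as one operation; the variable names are illustrative.

```python
# Values taken from the text.
kernels_in_parallel = 4
multiplications_per_kernel = 25
additions_per_kernel = 25
pixels_per_frame = 16 * 16
frames_per_second = 20_000
power_mw = 1.0

operations_per_second = (
    kernels_in_parallel
    * (multiplications_per_kernel + additions_per_kernel)
    * pixels_per_frame
    * frames_per_second
)
gops_per_mw = operations_per_second / 1e9 / power_mw
```

The product is 1.024 × 10⁹ operations per second, or roughly 1 GOPS for 1 mW.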
5.2.5 Scalability 

One of the major benefits of this architecture is scalability. Scalability is 
achieved by performing all computation on readout, which allows for small, 
scalable pixel sizes and a generic processing unit that does not depend on the 
size of the pixel array. The sizes of the pixel area for different technology 
processes are summarized in Table 5-3. For the 0.35 µm process, the size of 
the pixel is 8.75 µm × 8.75 µm, which is comparable to the size of CCD 
pixels. The advantage of GIP pixels (over those of a CCD) is their capability 
for simultaneously accessing several pixels in a given neighborhood, which 
enables parallel spatial and temporal convolution of the incident image. The 
low SNR of the GIP pixel can be improved by the addition of noise 
cancellation circuits, such as correlated double sampling [22]. However, the 
optimized CCD process has better sensitivity than any of the CMOS 
imagers. Nonetheless, the size, light sensitivity, noise characteristics, and 
focal-plane processing capabilities of the GIP pixel make it ideal for image 
processing. 

Figure 5-10. Precision of the scaling coefficients of the current in the processing units. 
(a) Linear plot of the coefficients of a single processing unit for nine light intensities. 
(b) Variations of coefficients within one processing unit across a range of light intensities. 
(c) Linear plot of coefficients of all four processors for one light intensity. (d) Variations of 
coefficients across processing units for one light intensity. 

The processing unit capitalizes on the scalability of the GIP design. The 
area and computational speed of the processing unit are independent of the 
size of the photo array. For a 1 cm² chip fabricated in a 1.2 µm process with 
333 × 333 photo array pixels, the ratio of the processing unit to the photo 
pixel array is 0.5%. For a 0.35 µm process with a 1142 × 1142 photo array 
on a 1 cm² die, this ratio is 0.04%. The size of the processing unit area is not 
affected when the size of the photo array increases, leading to a high area 
efficiency for the processing unit. The processing unit occupies such a small 
percentage of the total area that the transistors in the processing unit need 
not be scaled at the same rate as the transistors in the pixels. This 
observation is crucial for maintaining or improving the precision of the 
processing unit. By maintaining the physical size of transistors, the precision 
of the processing unit can be increased due to the higher resolution and 
improved tolerances of technologies with smaller feature sizes. A migration 
to smaller feature-size technologies clearly promises improvements in both 
area efficiency and precision for the processing unit. 
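The area figures quoted above can be reproduced from the pixel pitch and processing unit dimensions listed in Table 5-3. The sketch below assumes a square 1 cm × 1 cm die that is tiled entirely with pixels and that the pixel count per side is rounded down; the function names are hypothetical.

```python
DIE_SIDE_UM = 10_000  # 1 cm expressed in micrometers


def pixels_per_side(pixel_pitch_um):
    """Whole number of pixels that fit along one side of the die."""
    return int(DIE_SIDE_UM // pixel_pitch_um)


def processing_unit_percent(width_um, height_um):
    """Processing unit area as a percentage of the 1 cm^2 die."""
    return 100.0 * width_um * height_um / DIE_SIDE_UM ** 2


# (pixel pitch, processing unit width, processing unit height) from Table 5-3
processes = {
    "1.2 um": (30.0, 1680.0, 300.0),
    "0.5 um": (12.5, 700.0, 125.0),
    "0.35 um": (8.75, 490.0, 87.5),
}
summary = {
    name: (pixels_per_side(pitch), processing_unit_percent(w, h))
    for name, (pitch, w, h) in processes.items()
}
```

The computed ratios are about 0.50%, 0.09%, and 0.04%, with 333, 800, and 1142 pixels per side, respectively.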

The computational speed of the processing unit is also not affected by 
scaling the design, because virtual ground circuits are used to mask the 
current bus parasitic capacitance. The virtual ground circuit keeps the bus 
line charged to a constant voltage at all times, which maintains the access 
time of the photopixel current and keeps the computational time of the 
convolution kernels constant. As the architecture is scaled down to smaller 
technologies, the total line capacitance for a 1 cm² die remains constant 
because the scaling of the oxide will be compensated by the scaling of the 
size of the switch transistors that connect the pixels to the bus line. The 
length of the bus line remains constant. Specialized scanning techniques 
should be considered for denser pixel arrays in order to decrease the charge- 
up time for the selection lines. In addition, smaller processes allow for 
higher operational speed; for example, the operational speed can be 
increased to approximately 50 MHz if the 0.35 µm process is used (see 
Table 5-3). 

To maintain a constant frame rate as the size of the array is increased, the 
scanning frequency must be increased. Unfortunately, the pixel access 
circuits and processing circuits place limits on scanning speed. The 
importance of the virtual ground circuit in achieving fast scanning speeds 
under increased line capacitance has already been discussed. Current-mode 
processing is also important for maximizing processing speed. Virtual 
grounding is not being used on all GIP current summing nodes at this time, 
but this would be necessary for a large array scanned at high speeds. A 
common technique to further increase the frame rate is to divide the array 
into multiple parallel sub-arrays that are scanned and processed in parallel. 
In this case, however, the benefits of uniformity due to a single processing 
unit would be lost. It is clear that the GIP approach does not scale in 
computational complexity unless local analog memories are introduced in 
each pixel to store partial results for algorithmic processing. This is the 
major benefit of the CNN-UM over the GIP. In the future, however, some 
ideas may be borrowed from the CNN-UM to improve the algorithmic 
processing capabilities of the GIP. 

Table 5-3. Scaling properties of the GIP architecture. 

| | 1.2 µm process | 0.5 µm process | 0.35 µm process |
|---|---|---|---|
| Pixel size | 30 µm × 30 µm | 12.5 µm × 12.5 µm | 8.75 µm × 8.75 µm |
| Photosensitive area | 13.6 µm × 13.6 µm | 5.5 µm × 5.5 µm | 3.85 µm × 3.85 µm |
| Processing unit area | 1680 µm × 300 µm | 700 µm × 125 µm | 490 µm × 87.5 µm |
| Processing unit area per 1 cm² die (%) | 0.5% | 0.09% | 0.04% |
| No. of pixels per 1 cm² die | 333 × 333 pixels | 800 × 800 pixels | 1142 × 1142 pixels |
| Operational speed | 15 MHz | 30 MHz | 50 MHz |
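The relation between array size, scanning frequency, and frame rate discussed in this section can be made concrete with a small calculation. The sketch below assumes that one pixel position is read out per scanning clock cycle and that the "operational speed" entries of Table 5-3 can be taken as the scanning clock; both are simplifying assumptions, and the function names are hypothetical.

```python
def required_scan_frequency_hz(rows, cols, frames_per_second):
    """Scanning clock needed to read every pixel position once per frame."""
    return rows * cols * frames_per_second


def max_frame_rate(scan_frequency_hz, rows, cols):
    """Frame rate achievable with a given scanning clock (one pixel per cycle)."""
    return scan_frequency_hz / (rows * cols)


# Full 1 cm^2 arrays at the operational speeds listed in Table 5-3.
rate_1p2um = max_frame_rate(15e6, 333, 333)
rate_0p35um = max_frame_rate(50e6, 1142, 1142)
```

Under these assumptions a 333 × 333 array scanned at 15 MHz would reach roughly 135 frames per second, and a 1142 × 1142 array at 50 MHz roughly 38 frames per second, which illustrates why sub-array parallelism becomes attractive for large arrays.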

5.2.6 Further applications 

The spatial and temporal processing capabilities, the reprogrammability, 
and the parallel convolution computation of the GIP allow it to be used as a 
front end for many computer vision applications. The first layer of 
computation in these applications typically involves computing various 
spatiotemporal kernels in parallel. Computing these kernels at the focal 
plane frees up computational resources and improves the performance of 
these algorithms. For example, computing spatial corners requires first 
computing the horizontal and vertical edges of an image and then finding the 
points where both of these edges exist. If the temporal component for a 
given spatial corner is considered, spatial corners varying over time can be 
highlighted [23]. Computing horizontal and vertical edges as well as 
temporal frame differences in parallel at the focal plane greatly reduces the 
computational time for tracking spatiotemporal corners. Its ability to track 
spatiotemporal corners coupled with its low power consumption makes the 
GIP useful in a variety of robotics applications, such as target tracking, 
autonomous navigation, and obstacle avoidance [24]. 
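The corner computation described above can be sketched in a few lines once the horizontal and vertical edge images are available. The example assumes that an edge "exists" at a pixel when the magnitude of the edge response exceeds a threshold; the threshold, the optional temporal test, and the function name are illustrative choices, not the method of [23].

```python
def corner_map(h_edges, v_edges, threshold, temporal_diff=None):
    """Mark pixels where both horizontal and vertical edges are present.

    If temporal_diff is given, a corner is kept only where the frame
    difference also exceeds the threshold (a moving corner).
    """
    corners = []
    for r, (h_row, v_row) in enumerate(zip(h_edges, v_edges)):
        row = []
        for c, (h, v) in enumerate(zip(h_row, v_row)):
            is_corner = abs(h) > threshold and abs(v) > threshold
            if temporal_diff is not None:
                is_corner = is_corner and abs(temporal_diff[r][c]) > threshold
            row.append(is_corner)
        corners.append(row)
    return corners


h = [[0, 5], [5, 5]]
v = [[5, 0], [0, 5]]
static_corners = corner_map(h, v, threshold=1)
moving_corners = corner_map(h, v, threshold=1, temporal_diff=[[9, 9], [9, 0]])
```

With the GIP, the three input images would be produced in parallel at the focal plane, leaving only this comparison step for the host processor.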
Many orientation detection algorithms require the pre-convolution of 
images with spatial derivative kernels [25]. Computing simultaneous 
horizontal, vertical, and diagonal edges and using a weighted average can 
provide high-resolution information on the orientation of edge-forming 
objects. Combining temporal information with the edge orientation of an 
object leads to efficient methods for computing the direction of motion [23]. 

The GIP chip can compute specialized wavelet transforms as well as 
approximations of the discrete cosine transform (DCT) [26-27]. Computing 
wavelet transforms or the DCT requires several different patterns to be 
loaded sequentially in the scanning registers; the output of each scanning 
pattern is stored in an external memory (or in internal memory if available). 
The transforms are then completed by summing appropriately scaled values 
from the external memory. Implementation of these algorithms will be 
described in a future paper dedicated to applications of the GIP. 
5.3 Voltage-domain image processing: the temporal difference imager 

5.3.1 System overview 

The temporal difference imager (TDI) consists of four main components: 
a photopixel array of 189 rows by 182 columns, two vertical and five 
horizontal scanning registers, a control-timing unit, and three difference 
double sampling (DDS) units. Each pixel has two outputs: a current frame 
output and a previous frame output. The two intensity images are presented 
in parallel to two independent DDS circuits where reset voltage mismatches, 
kTC noise, charge injection due to switching, 1/f noise, and fixed-pattern 
noise (FPN) are suppressed. The difference between the two corrected 
intensity images is computed in a third DDS circuit and presented outside 
the chip. The control-timing unit synchronizes the timing between all 
scanning registers and manages an efficient pipeline mechanism for 
computing the difference between the two consecutive images. This unit also 
controls the integration time of the two frames, the time between two 
consecutive frames, the sample-and-hold timing, and the computation timing 
of the DDS circuits. Different readout techniques can be executed by 
changing bit patterns in the scanning registers and reprogramming the 
control-timing unit. Hence, a fair comparison between standard readout 
techniques and our proposed techniques can be made on this imager. 

5.3.2 Hardware implementation 

The active pixel sensor (APS) cell shown in Figure 5-11 is composed of 
a photodiode, two storage elements C1 and C2, switching transistors M2- 
M7, and readout transistors M8-M11. (Section 4.2 in Chapter 4 provides 
more details on APS.) PMOS transistor M1 is used to control the operation 
modes of the photodiode. This transistor increases the output voltage swing 
of the pixel by allowing the reset voltage level of the photodiode to be $V_{dd}$. 
In addition, image lag due to incomplete reset (which is evident when an 
NMOS reset transistor is used) is eliminated by using a PMOS reset 
transistor [28]. The increased output voltage swing comes at the expense of a 
larger pixel area, but using mainly PMOS transistors in the pixel (the 
exceptions are the output source follower transistors) minimizes this effect. 
The NMOS source follower transistors (M8 and M10) ensure that the 
voltage swing is between $V_{dd} - V_{th} - V(I_{bias})$ of this transistor and the 
minimum voltage required by the bias transistor for the follower to remain in 
saturation. In a traditional APS, the maximum output is described by 
equation (5.6). 

$$V_{out,max} = V_{dd} - V(I_{photodecay}) - V_{th,M8} - V(I_{bias}) \tag{5.6}$$

Hence, the voltage swing is maximized at the expense of larger pixel 
size, and the design can be safely used at lower supply voltage levels. 

Figure 5-11. Readout circuit for the temporal difference imager (TDI). 

The two sample-and-hold circuits are each composed of a sampling 
transistor (M2 or M7) and a capacitor (C1 or C2). Each capacitor is 
implemented with a transistor by connecting the source and drain to $V_{ss}$ 
and the bulk to $V_{dd}$. Hence, the effective storage capacitance is the parallel 
combination of $C_{gb}$, $C_{gd}$, and $C_{gs}$, where the subscripts g, d, s, and b 
indicate the transistor gate, drain, source, and bulk, respectively; this yields 
the maximum possible storage capacitance for the given transistor. The extra 
storage capacitance in the pixel will linearly reduce the photoconversion 
rate, but the kTC noise is improved only as the square root. Hence, the 
overall SNR will be reduced. 
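The scaling argument in the preceding paragraph can be written out explicitly. As a sketch, assume a photocurrent $I_{ph}$ integrated for a time $t_{int}$ on a total capacitance $C$, with kTC noise as the only noise source (these symbols are introduced here for illustration and are not taken from the original text):

$$V_{signal} = \frac{I_{ph}\, t_{int}}{C}, \qquad v_{n,kTC} = \sqrt{\frac{kT}{C}}$$

$$\mathrm{SNR} = \frac{V_{signal}}{v_{n,kTC}} = \frac{I_{ph}\, t_{int}}{\sqrt{kT\,C}} \propto \frac{1}{\sqrt{C}}$$

The signal falls as $1/C$ while the noise falls only as $1/\sqrt{C}$, so under these assumptions doubling the storage capacitance lowers the SNR by a factor of $\sqrt{2}$.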

The sampling of the photovoltage alternates between the two sample- 
and-hold circuits during consecutive frames. Once the stored charges in C1 
and C2 are read out to the DDS circuits, the pixel is reset and transistors M3 
through M6 are turned on to discharge the holding capacitors C1 and C2. 
Transistors M3 through M6 allow for individual pixels to be reset, instead of 
the row-wise pixel reset that is common in standard APS. The reset voltage 
is subtracted from the integrated values in two independent DDS circuits, 
eliminating the voltage offset variations due to the output source follower. 
This technique, known as difference double sampling, improves the noise 
characteristics of the image. The use of two independent DDS circuits for 
the entire imager further improves the precision and accuracy by eliminating 
row FPN, which would otherwise be present if row- or column-parallel DDS 
were used. After the images have been corrected, the two frames are 
subtracted in a third DDS circuit and the difference is provided outside the 
chip, together with the two intensity images. 

© V. Gruev and R. Etienne-Cummings, “A pipelined temporal difference imager,” IEEE J. Solid State 
Circuits, vol. 39, no. 3, Mar. 2004 (partial reprint). 

Figure 5-12. Pipeline timing. Along the time axis, each pixel cycles through stage 4 
(integrate on C1), stage 1 (integrate on C2), stage 2 (evaluate and reset), and stage 3 (NOP), 
with the sequence staggered across pixels. 

5.3.3 Pipeline readout technique 

The control-timing unit is crucial for synchronizing the different events 
executed in the imager. This unit controls the four-stage pipeline mechanism 
implemented on the chip. The timing diagram of the four stages is presented 
in Figure 5-12. The horizontal axis presents the timing events in one frame, 
whereas the vertical axis represents different pixels across the imager. 
Hence, four different tasks (stages) can be performed at the same time across 
different parts of the image plane. In stage 1 of the pipeline, the 
photovoltage is integrated and sampled on capacitor C2 just before the 
beginning of stage 2. Increasing or decreasing the number of integration 
columns in stage 1 can vary the integration period. In stage 2 of the pipeline 
(the pipeline consists of a single row), the previously stored photovoltage on 
C1, the newly integrated photovoltage on C2, and the reset values are all read 
out to the DDS circuit. The difference C2 - C1 is evaluated after the 
subtraction of the reset offset from the stored values. 

The stored photovoltage on C1 is held for the entire integration period of 
C2. Due to the leakage currents at the holding node, this stored value will be 
less than the original value. The integrated photovoltage sampled on C2 will 
not be degraded because the difference of the two values is evaluated just as 
the integration period of C2 is completed. Therefore, the minimum 
difference between these two values will be the magnitude of the decay of 
C1, which will be the same for all pixels in the imager. This is very 
important for obtaining good precision in the difference image: an offset in 
the difference computation can be corrected across the entire image in the 
final result. 

In stage 3 of the pipeline, no operations (NOPs) are executed. The length 
of this stage can vary between zero and the scanning time of one entire 
frame. When the integration times of stages 1 and 4 are each equal to the 
time required to scan half of the columns in the imager, stage 3 does not 
exist. When the integration periods of stages 1 and 4 are each one column 
time, the NOP stage will be close to the scanning time of an entire frame. 
This stage adds some flexibility in overlapping two integration processes on 
the entire imager at the same time, while still controlling the integration time 
of each frame independently. The integration times of stages 1 and 4 
can be the same or different, depending on the application requirement. In 
most cases, the integration times of these stages are equal. 

In stage 4, the photovoltage is integrated and sampled on capacitor C1. 
Increasing or decreasing the number of integration columns in the fourth 
stage can vary the integration period. Once the integration period is 
complete, the integrated value is stored and held in C1. The holding time of 
the C1 value depends only on the integration time of stage 1 and is the same 
for all pixels across the imager. Stages 1 and 4 of the pipeline cannot 
overlap, limiting the maximum integration time of both stages to half of the 
scanning time of one image frame. The pipeline readout mechanism allows 
for continuous difference evaluation as each consecutive frame is read out. 
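The relationship between the integration stages and the NOP stage can be summarized with a small bookkeeping sketch. It measures time in column-scan periods and assumes that one frame scan equals the number of columns and that the single evaluation step of stage 2 is negligible at this level of detail; these are simplifications drawn from the description above, and the function name is hypothetical.

```python
def nop_length(n_columns, stage1_columns, stage4_columns):
    """Length of stage 3 (NOP) in column periods for one frame scan.

    Stages 1 and 4 must not overlap, so together they cannot exceed
    one frame scan (n_columns column periods).
    """
    if stage1_columns < 1 or stage4_columns < 1:
        raise ValueError("each integration stage needs at least one column period")
    if stage1_columns + stage4_columns > n_columns:
        raise ValueError("stages 1 and 4 would overlap")
    return n_columns - stage1_columns - stage4_columns


# The TDI array has 182 columns.
longest_integration = nop_length(182, 91, 91)  # half a frame each: no NOP stage
shortest_integration = nop_length(182, 1, 1)   # one column each: NOP spans nearly a frame
```

The two extreme cases correspond to the limits stated in the text: stage 3 vanishes when both integrations last half a frame scan and approaches a full frame scan when both last a single column period.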

This pipeline mechanism improves the precision of the evaluated 
difference. To a first-order approximation, the precision of the difference 
strongly depends on the leakage currents at the holding nodes of Cl and C2. 
These leakage currents are functions of two factors: holding time and 
illumination. To weaken the dependency on illumination, various layout 
techniques were applied; these included using metal shielding methods and 
designing a symmetric layout to ensure that leakage currents on both holding 
nodes were equal. The current pipeline readout mechanism cannot eliminate 
the problem of leakage due to illumination. On the other hand, it reduces the 
problem of time dependency of the leakage currents by ensuring that the 
time delay between the two images is equal for all pixels. Hence, each pixel 
integrates for an equal amount of time, and the holding time for every pixel is 
also equal. Since the holding times of the charges in C1 and C2 are equal 
and the leakage currents for all pixels can be approximated to be equal, 
the offset in the temporal difference can be canceled in the final 
computation. Assuming that the two stored values in capacitors C1 and C2 
are the same (V_C), the offset in the temporal difference across every pixel is 
described by equation (5.7). 

Figure 5-13. Snap shot and evaluate mode. 

The maximum voltage offset for these pixels is 



$$V_{C1} - V_{C2} = (V_C - \Delta V_{leakage}) - V_C = -\Delta V_{leakage} = -NOP \, (I_{leakage} / C_{storage}) \qquad (5.7)$$

In equation (5.7), V_C1 is the voltage stored on capacitor C1, V_C2 is the 
voltage stored on capacitor C2, ΔV_leakage is the decay of the value stored in 
capacitor C1, I_leakage is the reverse diode leakage current, and NOP is the 
holding time of the stored charge in capacitor C1 or C2. Equation (5.7) states 
that the voltage offset error across the entire imager is independent of the 
pixel position; hence, the error will be an equal offset across the entire 
imager. 
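
The position independence of this offset can be checked with a few lines of arithmetic. The sketch below evaluates the magnitude of the leakage decay, hold time multiplied by leakage current over storage capacitance, for several pixels that share one hold time. The numerical values (20 ms hold, 1 fA leakage, 100 fF storage capacitor) and the function name are hypothetical and chosen only to make the example concrete.

```python
def leakage_offset(hold_time_s, i_leakage_a, c_storage_f):
    """Voltage lost from a holding capacitor: hold time x (I_leakage / C_storage)."""
    return hold_time_s * i_leakage_a / c_storage_f


# Hypothetical values: 20 ms hold, 1 fA leakage, 100 fF storage capacitor.
hold_time = 20e-3
offsets = [leakage_offset(hold_time, 1e-15, 100e-15) for _ in range(4)]
print(offsets)  # every pixel sees the same offset because the hold time is shared
```

Because the offset is the same everywhere, a single correction value applied to the whole difference image is sufficient in this mode.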

5.3.4 Snap shot and evaluate mode 

The TDI can also operate in a snap shot and evaluate mode. This mode of 
operation is shown in Figure 5-13. In this mode, the photovoltage is first 
integrated on capacitor C1. Then a new photovoltage is integrated on 
capacitor C2. After the second integration is completed, the difference 
between the two stored values is evaluated. Since the difference evaluation is 
computed in a sequential manner, the holding interval of C1 will increase as 
the image is scanned out. When the first pixel is evaluated, capacitor C1 has 
decayed for the integration time of C2. For each additional evaluation of the 
difference, an additional hold time is introduced. The last pixel will have the 
maximum hold time described by equation (5.8): 

$$\Delta t_{max} = t_{(int\ of\ C2)} + M N t_{clk} \qquad (5.8)$$

Figure 5-14. (a) Sampled intensity (left) and difference (right) images of a rotating grayscale 
wheel obtained with the TDI. (b) Sampled intensity (left) and difference (right) images 
obtained with the TDI. (c) Temporal difference images for the grayscale wheel at increasing 
(left to right) rotation speeds. 

In equation (5.8), t_(int of C2) is equal to the integration time of C2. M and N 
are the dimensions of the imaging array. The additional hold time, which 
increases for each scanned pixel, introduces a different offset error for each 
pixel. If the light intensity dependency of this offset error is ignored, the 
offset should be linearly increasing across the entire imager. This readout 
technique will require offline calibration and correction, and may not be 
suitable for applications requiring direct and fast computations of temporal 
difference [28]. 
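
The growth of the hold time across the array can be sketched as follows. The model assumes a row-major scan in which each evaluated pixel adds one clock period to the hold time of C1, so that the last pixel reaches the integration time of C2 plus M x N clock periods. The array size, clock period, leakage current, and storage capacitance are hypothetical values, and the function names are illustrative.

```python
def snapshot_hold_times(rows, cols, t_int_c2, t_clk):
    """Hold time of C1 for each pixel, in row-major scan order."""
    return [[t_int_c2 + (r * cols + c + 1) * t_clk for c in range(cols)]
            for r in range(rows)]


def leakage_offsets(hold_times, i_leakage, c_storage):
    """Per-pixel offset, ignoring any dependence of leakage on light intensity."""
    return [[t * i_leakage / c_storage for t in row] for row in hold_times]


# Hypothetical 4 x 4 array, 10 ms integration on C2, 1 us pixel clock.
holds = snapshot_hold_times(4, 4, 10e-3, 1e-6)
volts = leakage_offsets(holds, 1e-15, 100e-15)
print(holds[0][0], holds[-1][-1])  # shortest and longest hold times
print(volts[0][0], volts[-1][-1])  # offset grows linearly along the scan
```

The resulting ramp is what an offline calibration step would have to remove in this mode.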

5.3.5 Results 

Real-life images from the TDI are shown in Figures 5-14(a) and (b). In 
each figure, the intensity image is on the left side and the absolute difference 
is on the right side. The contour of the rotating wheel in Figure 5-14(a) is 
clearly visible in the corresponding difference image. Figure 5-14(c) shows 
the temporal differences of the grayscale wheel at different rotational speeds. 
The grayscale wheel contains 36 different grayscale values, with 10 degrees 
of spatial distribution for each value and constant grayscale increments. In 
the first image on the left in Figure 5-14(c), the grayscale wheel is rotated 
slowly and the temporal difference image records only a small difference 
between the two consecutive images. Due to the abrupt difference between 
the first and last grayscale values on the grayscale wheel, the temporal 
derivative computes a high difference in this border region. The temporal 
difference in the rest of the image is constant and low due to the constant 
grayscale increments. As the rotational speed of the grayscale wheel is 
increased, the temporal difference image shows greater difference (overlap) 
between two consecutive images. In the last case, when the rotational speed 
is the highest, a wide region of high temporal difference values is recorded. 
The temporal difference in the rest of the image also has higher values 
compared to the other cases where the wheel was rotated at slower speeds. 
This is due to the increased overlap between the two consecutive images, 
which leads to higher temporal difference values. 

Figure 5-15. (a) Leakage currents at selected li 
function of light intensity. 

Figures 5-15(a) and 5-15(b) demonstrate the leakage at the holding 
nodes as a function of light intensity. Figure 5-15(a) presents discharge 
curves at the holding node C1 for several different light intensities. For light 
at low intensities (10² µW/cm²) and mid-level intensities (10² µW/cm²), the 
slopes of the discharge curves are negligible, corresponding to less than 
10 mV/sec decay (Figure 5-15(b)). This decay rate allows for frame rates as 
slow as 3 fps with a temporal difference precision of 8 bits. For very high 
illumination intensities, the slope of the discharge curves increases to about 
100 mV/sec. The parasitic reverse diodes at the holding node are in deep 
reverse bias, which results in high leakage currents. 

When operating the TDI at 30 fps with a decay rate of 100 mV/sec, a 
temporal difference can be computed with 8-bit accuracy (the output swing 
of the pixel is 3 V; a 12-bit low-noise ADC was used to digitize the image). 

Figure 5-16. Comparison between (a) snap shot and (b) pipelined modes of operation at 
150 mW light intensity. 

Equation (5.9) indicates the bits of precision as a function of decay rate, 
frame rate, and voltage swing of the pixel: 



$$\text{error (\%)} = \frac{\text{decay rate}}{\text{fps} \times \text{voltage swing}} \times 100 \qquad (5.9)$$
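
The two operating points quoted in the text can be checked against equation (5.9) with a short calculation. The conversion from percentage error to bits, taken here as the base-2 logarithm of the inverse fractional error, is an assumption of this sketch rather than a definition given in the text.

```python
import math


def difference_error_percent(decay_rate_v_per_s, fps, swing_v):
    """Equation (5.9): decay during one frame as a percentage of the swing."""
    return decay_rate_v_per_s / (fps * swing_v) * 100.0


def bits_of_precision(error_percent):
    """Assumed conversion: bits = log2(1 / fractional error)."""
    return math.log2(100.0 / error_percent)


slow = difference_error_percent(10e-3, 3, 3.0)     # 10 mV/sec at 3 fps, 3 V swing
fast = difference_error_percent(100e-3, 30, 3.0)   # 100 mV/sec at 30 fps, 3 V swing
print(slow, bits_of_precision(slow))
print(fast, bits_of_precision(fast))
```

Both cases give an error of roughly 0.11%, which under the assumed conversion corresponds to somewhat better than 8 bits.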



The leakage currents at the holding nodes limit the precision of the 
temporal difference. If a snap-shot-and-evaluate readout technique is used, 
this precision will be further degraded. Using the pipeline readout technique 
helps to eliminate this problem, however, as shown in Figure 5-16. This 
figure compares the operation of the TDI in the two different readout modes. 

The snap shot and evaluate mode is the mode usually discussed in the 
literature [29-30]. In this mode, two consecutive snap shots are obtained and 
their difference is computed. Unfortunately, the leakage currents strongly 
influence the accuracy of the computed difference. The first pixel evaluated 
will have the least holding time and leakage. As the rest of the image is 
evaluated, each additional pixel will have additional time and leakage. The 
last pixel evaluated in the image will have the longest time delay. The large 
leakage currents in the last pixels will greatly affect the accuracy of their 
evaluated differences. The accuracy of these differences across the imager is 
demonstrated in Figure 5-16(a). The slope is evident in both the x and y 
directions as the image is scanned and as the differences are evaluated. 

In the second mode of operation, the TDI takes advantage of the pipeline 
architecture. Since all pixels have the same holding time when operating in 
this mode, the variation of the difference due to leakage currents is 
minimized. The results can be seen in Figure 5-16(b). The mean 
difference in this case is 16.5 mV with 0.14% variations of the maximum 
value. The advantages of the pipeline architecture compared to the snap shot 
operational mode are readily apparent. 

The total power consumption is 30 mW (with three DDS circuits each 
consuming 9 mW of power) at 50 fps with fixed-pattern noise at 0.6% of the 
saturation level, which makes this imager attractive for many applications. 
The greater than 8-bit precision for the difference between two consecutive 
images and the relatively low FPN are the major advantages of this 
architecture. 

5.4 Mixed-mode image processing: the centroid-tracking imager 

5.4.1 System overview 

This imager consists of two subsystems, the APS imager and the centroid 
tracker. The APS array obtains real-time images of the scene and the 
centroid tracker computes the location of moving targets within the scene. 
Each can be operated independently of the other. In this design, no resources 
are shared between the two except for the focal plane itself. 

Figure 5-17 shows the floor plan of the array and edge circuitry. To 
facilitate tiling, the pixel for centroid computation is exactly twice as long on 
each side as the APS. Pixels in the same row are of the same type, and the 
array rows alternate between centroid-localization pixels and APS. Due to 
the difference in size of the pixels, APS rows are 120 pixels across whereas 
each centroid row contains 60 pixels. The non-uniform arrangement of 
pixels was motivated by a desire to increase APS resolution for better image 
quality. Unfortunately, this decision was directly responsible for APS 
matching and performance problems, which will be discussed in more detail 
below. Digital lines are run along rows to keep the digital switching 
transients for one style of pixels from coupling to signals in the other. The 
chip was fabricated in a standard analog 0.5 µm 1P3M CMOS process. 

5.4.2 APS imaging subsystem 

The design of the APS pixel follows the same basic three-transistor, 
one-photodiode design pioneered by Fossum et al. [31] (shown in Figure 5-18). 
Included are a reset transistor, an output transistor, and a transistor select 
switch to address the pixel. The structure is simple, with no provision for 
electronic shuttering. It is optimized primarily for density and secondarily 
for fill factor. All transistors in the pixel are NMOS to reduce the area of the 
pixel. (More details on APS are presented in Chapter 4, section 4.2.) 

The row circuitry is composed of two cyclical shift register chains: one 
for row reset signals and the other for row select signals. Each row of pixels 
receives reset and row select signals from one stage of each chain. Clocking 
these shift registers advances their bit pattern forward by one row and starts 
the readout of the next row. The reset shift register can be preloaded with 
blocks of ones and zeros in a flexible way, allowing the integration time for 
each row to be specified as a modifiable fraction of the total frame time. 
This can be viewed as a “rolling shutter.” In addition, there is circuitry on 
the reset lines to facilitate reset timing on a shorter time scale than one row 
clock. A separate global signal, labeled “directReset” in Figure 5-20, is 
ANDed with the signal of each row from the shift register. Using this signal, 
integration can be stopped and reset initiated in the middle of the output 
cycle of a single row. This is especially important for the operation of the 
correlated double sampling (CDS) system described below. It also permits 
easy prototyping, allowing a simple method for examining the operation of 
the pixels in a single row without running the entire array. 

Figure 5-17. System-level view of the centroid-tracking imager chip. 

Figure 5-18. APS pixel schematic. 
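
The way a preloaded reset pattern sets the integration time can be sketched with a small behavioral simulation of two circular shift registers. The sketch assumes a one-hot row-select pattern, a single contiguous block of ones in the reset register that trails the selected row, and a pixel that integrates whenever its reset bit is zero. These choices, the 8-row array, and the function name are illustrative assumptions; the actual bit patterns used on the chip are not specified in the text.

```python
def integration_row_clocks(n_rows, reset_block):
    """Row clocks between the last reset and the next select, for each row."""
    if not 1 <= reset_block < n_rows:
        raise ValueError("reset block must cover between 1 and n_rows - 1 rows")
    select = [1] + [0] * (n_rows - 1)                        # one-hot select pattern
    reset = [0] * (n_rows - reset_block) + [1] * reset_block  # block trailing the select bit
    last_reset = [None] * n_rows
    result = [None] * n_rows
    for tick in range(2 * n_rows):            # two frames, so every row is measured
        for row in range(n_rows):
            if reset[row]:
                last_reset[row] = tick
            if select[row] and last_reset[row] is not None:
                result[row] = tick - last_reset[row]
        # Clocking advances both patterns forward by one row.
        select = select[-1:] + select[:-1]
        reset = reset[-1:] + reset[:-1]
    return result


print(integration_row_clocks(8, 2))  # long integration: 6 of 8 row clocks
print(integration_row_clocks(8, 7))  # short integration: 1 of 8 row clocks
```

In this model, widening the block of ones shortens the integration time of every row by the same amount, which is one way to read "a modifiable fraction of the total frame time."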

Each column of pixels has its own dedicated processing circuitry. The 
column circuitry starts with its most important block, the CDS circuit [32-33]. 
This circuit subtracts the reset voltage from the signal voltage, ensuring 
that only the difference between the two is measured and not the absolute 
signal level itself. This drastically reduces offset errors in readout. It also 
compensates for noise and different reset voltage levels resulting from 
different light intensities during reset. The CDS is implemented as a simple 
switched capacitor circuit, producing a single-ended output voltage. A fully 
differential circuit would have exhibited more immunity to power supply 
ripple and interference from other signals, but these factors are not 
compelling in this application. Therefore, a simpler method is used to 
shorten design time and to minimize the area of the layout. Efficient use of 
space is especially important for this circuit because it is used in each 
column. A simple 7-transistor (diffamp and inverter) opamp with Miller 
compensation in a unity-gain configuration follows the CDS circuit for 
output buffering. The end of the column circuit employs yet another shift 
register chain to sequentially activate the switches that output one column 
voltage at a time to the single-pin output. Figure 5-19 shows a schematic of 
the CDS circuit. 

Figure 5-19. CDS, column buffer, and output switching circuit. 

5.4.3 Centroid-tracking subsystem 

The basic function of this subsystem is to compute the centroid of all the 
pixels that have brightness levels varying with time. This approximates 
finding the centroid of a moving object. A moving object will at least cause 
pixels to change at its edges (in the case of a solid-colored object, for 
example) and usually many pixels within the image of the object will also 
change (due to details or texture). In either case, the centroid of all 
time-varying pixels will be close to the center of the object. This scheme works 
most accurately for small objects because no points on a small object are 
very far from the centroid. The uncertainty in pixel activity detection will 
thus cause a smaller possible error in the computation of centroid position. 
In this particular design, implementation details necessitated that only an 
increase in brightness is detected; the reasons for this modification are 
explained below. With this alteration, moving bright objects on a dark 
background should theoretically be tracked by their leading edge, and dark 
objects on a bright background by their trailing edge. This may cause 
additional deviation from the true centroid in some situations. However, 
most real-world objects have visible texture and are not solid-colored. In 
these situations, many pixels inside the outline of the object will be activated 
in addition to the outline pixels, lessening the impact of ignoring intensity 
decreases. The output of this subsystem is a set of two voltages: one for the x 
position and one for the y position. 

The method employed to detect pixel brightness changes can be 
considered as a simplified form of an address event representation imager 
[34-36]. The only output from each activated pixel is the row and column of 
that pixel. Edge circuitry then processes the activated rows and columns to 
find the centroid. Moving the more complicated processing to the edges of 
the array keeps the pixel size smaller and helps to increase the fill factor for 
the motion-sensitive pixels. 

5.4.4 Centroid pixel 

The pixel itself starts with a photodiode, which is biased by an NMOS 
transistor with its gate voltage fixed (see Figure 5-20). The voltage at the 
source of this load NMOSFET will be proportional to either the logarithm or 
the square root of the incident light intensity, depending on whether the 
photodiode current operates the NMOS in the subthreshold region or the 
above-threshold region, respectively. Since the goal is to detect a relative 
change in brightness, the circuit is designed to be sensitive to the same 
multiplicative change in photocurrent at any absolute brightness level. Such 
contrast-sensitive photodetectors have also been observed in biological 
visual systems [37]. The logarithmic transfer function of the subthreshold 
transistor translates a multiplicative increase or decrease in photocurrent into 
an additive increase or decrease in output voltage, simplifying the task for 
the next stage of the pixel. The square root function does not share this 
property exactly, but it has a similar curve and approximates a logarithm. 
Fortunately, the pixels of the centroid-tracking imager chip operate in the 
subthreshold region for most light levels. Light intensities of over 
10 mW/cm² are required to generate over 1 nA of photocurrent, and in 
practice even extremely bright light conditions do not exceed 1 mW/cm² at 
the photosensor. The photosensitive voltage is AC-coupled to the rest of the 
pixel through a PMOS capacitor with the well tied to the drain and source. 
The rest of the pixel consists of a resettable comparator circuit (implemented 
using a biased CMOS inverter) and a feedback switch. The inverter includes 
a cascode transistor to enhance gain. 
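
The contrast-sensitive behavior of the subthreshold photoreceptor can be illustrated numerically. The sketch below uses the simplified textbook subthreshold relation for an NMOS load with a fixed gate voltage, in which the source voltage falls by the thermal voltage times the natural logarithm of the photocurrent ratio. The gate voltage, slope factor, pre-exponential current, and the 15 mV trip threshold are hypothetical values chosen for illustration; the comparator is reduced to a simple threshold test on the voltage drop.

```python
import math

U_T = 0.0258  # thermal voltage near room temperature, in volts


def photodiode_voltage(i_photo, v_gate=3.3, kappa=0.7, i_0=1e-15):
    """Source voltage of a subthreshold NMOS load (simplified model, bulk grounded)."""
    return kappa * v_gate - U_T * math.log(i_photo / i_0)


def pixel_fires(i_reset, i_now, delta_v_trip):
    """True when the photodiode voltage has dropped by at least delta_v_trip."""
    drop = photodiode_voltage(i_reset) - photodiode_voltage(i_now)
    return drop >= delta_v_trip


# Doubling the photocurrent gives the same voltage step at 1 pA and at 1 nA.
drop_low = photodiode_voltage(1e-12) - photodiode_voltage(2e-12)
drop_high = photodiode_voltage(1e-9) - photodiode_voltage(2e-9)
print(drop_low, drop_high)

print(pixel_fires(1e-12, 2e-12, 0.015))    # brightness doubled: event
print(pixel_fires(1e-12, 0.5e-12, 0.015))  # brightness halved: no event
```

The equal voltage steps at very different absolute currents are what make a single fixed threshold usable across lighting conditions.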

Operation of the pixel starts with the reset of the comparator block within 
the pixel. The inverter feedback switch is closed, input is made equal to 
output, and the inverter settles at its switching voltage. At this time, the 
difference between the photodiode cathode and inverter input voltages is 
stored across the PMOS capacitor. The PMOS capacitor is held in inversion, 
since the inverter reset voltage is significantly lower than the photodiode 
voltage. When the switch is opened, the inverter goes into open-loop 
operation. As the light level on the photodiode increases, the voltage on its 
cathode will decrease. Since the input to the inverter circuit is floating (high 
impedance), its voltage will now track the voltage on the photodiode, offset 
by the voltage across the capacitor itself. When the voltage decreases by a 
given amount ΔV corresponding to a given factor increase in photocurrent, 
the inverter will trip and its output will go high. If light on the pixel 
decreases, however, no event will be signaled because the inverter will move 
even farther away from its switching threshold. 

Figure 5-20. (a) APS row circuitry. (b) Schematic of the centroid-tracking pixel. 



The amount of change in incident light necessary to trigger the pixel after 
reset is released depends on the setting of V_pbias and the intensity of light 
incident during reset. The setting of V_pbias will set the reset voltage of the 
inverter. To maximize the gain of the inverter and to save power, 
subthreshold-level currents are used in the inverter. Equation (5.10) follows 
from equating the drain currents of the PMOS and NMOS transistors: 



$$I_{0N} \, e^{\kappa_N V_{out} / U_t} = I_{DN} = I_{DP} = I_{0P} \, e^{\kappa_P (V_{dd} - V_{pbias}) / U_t} \qquad (5.10)$$

where U_t is the thermal voltage, κ_N and κ_P are subthreshold slope factors, 
and I_0N and I_0P are process-dependent and gate-geometry-dependent factors 
relating fundamental currents of the subthreshold transistors. The equation 
for output reset voltage is thus 



$$V_{invreset} = \frac{U_t}{\kappa_N} \ln\left(\frac{I_{0P}}{I_{0N}}\right) + \frac{\kappa_P}{\kappa_N} (V_{dd} - V_{pbias}) \qquad (5.11)$$



Since I_0P is significantly less than I_0N, it can be seen from this equation 
that V_invreset < V_dd - V_pbias. The difference between V_invreset and V_T of the 
NMOS row and column pull-down transistors determines the initial output 
ΔV necessary to cause the NMOS pull-down transistors to conduct and 
signal a change: 

$$\Delta V_{out} = V_{TN} - V_{invreset} \qquad (5.12)$$



Dividing this ΔV_out by the gain of the inverter yields the ΔV_in necessary to 
sufficiently move the output: 

$$\Delta V_{in} = \frac{\Delta V_{out}}{A_{inv}} \qquad (5.13)$$



where the gain of the inverter with subthreshold drain current is 

yielding 

(5.16) 

or 

$$\Delta V_{in} = -\frac{U_t \, (V_{TN} - V_{invreset})}{\kappa_N V_{0P}} \qquad (5.17)$$

The terms V_0N and V_0P are the Early voltages of the NMOS and PMOS 
transistors in subthreshold states. Because of the cascode transistor, the g_ds of 
transistor M5 no longer makes a significant contribution to the gain, and can 
be neglected in the final expression. Note that ΔV_in is negative due to the 
negative gain of the inverter. This equation describes the basic operation of 
the pixel. More details, including the effect of the coupling of the switching 
voltages, will be presented in section 5.4.6 and later sections. 

The inverter drives two NMOS pull-down transistors that are attached to 
the particular row and column lines associated with the pixel. These lines are 
set up in a wired-OR configuration, with weak PMOS pull-up transistors on 
the edges of the array. Switches can disconnect the array from the edge 
circuitry to avoid current draw during reset. 

5.4.5 Centroid edge circuitry 

The centroids of the activated rows and activated columns are computed 
separately to arrive at a final (x,y) coordinate for the two-dimensional 
centroid. A center-of-mass algorithm is employed, resulting in sub-pixel 
precision. 

The following is a description of the edge circuitry operation specifically 
for the column edge; row edge operation works identically. Figure 5-21 
shows the schematic of this module. The edge of the centroid subsystem 
receives a series of column outputs corresponding to each column of the 
centroid pixel array. Columns containing pixels that have experienced an 
increase in their brightness will show up as a logic low. The center-of-mass 
calculation computes a weighted average of every activated column using 
the column position as weight. For example, if only columns 20 and 21 have 
been activated, the result of the center-of-mass calculation would be 20.5. 
This example also illustrates sub-column position precision. The position 
weights are represented as a set of voltages from a resistive ladder voltage 
divider with as many taps as there are columns. These voltages are buffered 
using simple five-transistor differential amplifiers. A column with a low 
(activated) output will first set an SR flip-flop, locking it high until the flip- 
flop is reset with an externally provided reset signal. The outputs of the SR 
flip-flops turn on weak PMOS transistors operating in the ohmic region, 
which connect the column weight voltages to the centroid output node. The 
PMOS transistors have a width/length ratio of 4/22, and are turned on by 
lowering their gates fully to ground. The weight voltages of all active 
columns are then connected to the common node through the PMOS pseudo- 
resistors, and this network of voltages interconnected through identical 
pseudo-resistors computes the average of all voltages connected. The output 
voltage is thus the center of mass value of all active columns. 
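
The averaging performed by the pseudo-resistor network can be sketched numerically. The code below builds evenly spaced ladder tap voltages, averages the taps of the active columns (which is the voltage that identical resistors tied to an unloaded common node produce), and maps the result back to a column position. The 60-column width and the 1.8 V to 2.8 V ladder range follow values given elsewhere in the text; the function names and the ideal-resistor model are assumptions of this sketch.

```python
def ladder_voltages(n_columns, v_min, v_max):
    """Evenly spaced tap voltages, one per column."""
    step = (v_max - v_min) / (n_columns - 1)
    return [v_min + i * step for i in range(n_columns)]


def centroid_voltage(active_columns, taps):
    """Identical resistors into an unloaded common node yield the mean tap voltage."""
    return sum(taps[c] for c in active_columns) / len(active_columns)


def voltage_to_position(v, v_min, v_max, n_columns):
    """Invert the ladder mapping to recover a (possibly fractional) column index."""
    return (v - v_min) / (v_max - v_min) * (n_columns - 1)


taps = ladder_voltages(60, 1.8, 2.8)
v_out = centroid_voltage([20, 21], taps)
print(voltage_to_position(v_out, 1.8, 2.8, 60))
```

With columns 20 and 21 active, the recovered position lies halfway between them, matching the sub-column example in the text.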

In the ideal case, all activated PMOS resistors would be in the linear 
region so that V_ds has a linear relation to the current flowing, approximating 
a true resistor. For a PMOS to be operating in the linear region, the condition 
-V_ds < -V_gs + V_T must hold. Equivalently, V_ds > V_gs - V_T. It must be true that 
V_gs < 0 - V_laddermin, where V_laddermin is the low voltage of the resistive ladder. 
Therefore, the sufficient condition for linear operation can be expressed as 

$$V_{ds} > -V_{laddermin} - V_T \qquad (5.18)$$

The threshold voltage is dependent on the source voltage due to the bulk 
effect. In addition, V_ds > V_laddermin - V_laddermax must always be true, because the 
minimum drain voltage possible is the minimum voltage of the resistive 
ladder and the maximum is V_laddermax, the maximum value of the voltage 
ladder. The values of V_laddermin that satisfy the inequality 

$$V_{laddermin} - V_{laddermax} > -V_{laddermin} - V_T \qquad (5.19)$$

or 

$$V_{laddermin} > \frac{V_{laddermax} - V_T}{2} \qquad (5.20)$$

will cause V_ds to absolutely satisfy the condition for operating in the linear 
region. For a typical V_TP of -1 V and V_laddermax of 2.8 V, the low voltage of 
the resistive ladder must therefore be 1.95 V to guarantee that all PMOS 
transistors will operate in the linear region. 

Figure 5-21. One segment of the centroid edge circuitry. 

In the normal mode of operation, the low voltage of the resistive ladder 
used in the centroid-tracking imager chip is 1.8 V, and the high ladder 
voltage is 2.8 V. In the worst case, it is possible that a PMOS transistor 
sometimes will not be operating in the linear region, and hence will 
dominate the averaging operation due to its higher conductance. In practice, 
however, moving objects are localized. As long as there is only a single 
moving object in the scene, the activated rows and columns will be in close 
proximity to one another. Hence, the V_ds voltages between the reference 
voltages and their average will stay small enough to keep each PMOS 
operating in the linear region. 
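
The lower bound on the ladder voltage can be evaluated directly. The sketch below assumes the sufficient condition takes the form V_laddermin > (V_laddermax - V_T) / 2, which follows from combining the linear-region condition with the worst-case V_ds across the ladder; the function names are illustrative.

```python
def min_ladder_low_voltage(v_ladder_max, v_t):
    """Smallest ladder low voltage that keeps every PMOS in the linear region."""
    return (v_ladder_max - v_t) / 2.0


def threshold_for_bound(v_ladder_max, v_ladder_min):
    """Threshold voltage for which a given low voltage sits exactly on the bound."""
    return v_ladder_max - 2.0 * v_ladder_min


print(min_ladder_low_voltage(2.8, -1.0))   # bound for a nominal -1 V threshold
print(threshold_for_bound(2.8, 1.95))      # threshold implied by a 1.95 V bound
print(1.8 > min_ladder_low_voltage(2.8, -1.0))  # is the 1.8 V setting guaranteed?
```

With a threshold of exactly -1 V this expression evaluates to 1.9 V. The 1.95 V figure quoted in the text corresponds to a threshold of about -1.1 V, which may reflect an allowance for the bulk effect, although the text does not say so. Either way, the 1.8 V setting used in normal operation lies below the guaranteed bound, consistent with the worst-case caveat above.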

It should be noted that each pixel position factors into the center of mass 
calculation with equal weight. Because the region of interest is defined as 
everywhere that pixel light intensity has changed, it is necessary to assume 
that every point has a weight of 0 or 1. It is possible to imagine other 
functions, such as one that weights each pixel by how much its light 
intensity has changed. However, it is unclear whether this is a desirable 
metric. Therefore, it is assumed for this system that the only meaningful 
algorithm is a binary condition: change or no change. 

In addition, this circuit does not consider the number of pixels activated in 
a column or row. It gives every column or row the same weight independent 
of the number of activated pixels. Instead of noting the actual centroid of the 
pixels that are activated, it detects the centroid of a rectangular box 
coincident with the edges of the region of activated pixels. This was chiefly 
an implementation-related optimization. It is much easier for the edge 
circuitry to note activity/non-activity than to include how much activity a 
certain row or column contains. For most objects, the centroid of a 
coincident rectangular box is a good approximation of their true centroid. 
The main drawback of this modified centroid is that single outlying pixels 
are given as much weight as those that are clustered together. Thus, false 
activity registering on one pixel gives the row and column of that pixel the 
same weight in centroid calculations as the rows and columns that contain 
many pixels responding to the image of the real object. This is a regrettable 
disadvantage, but it can be justified by the much-simplified implementation 
of the current scheme. 
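
The effect of a single outlying pixel on the two kinds of centroid can be seen in a small example. The sketch below compares the true centroid of all activated pixels with the centroid obtained by giving every active row and column equal weight, as the edge circuitry does. The 3 x 3 blob, the outlier position, and the function names are hypothetical.

```python
def true_centroid(pixels):
    """Mean position of all activated pixels."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def activity_centroid(pixels):
    """Mean of the distinct active columns and rows, each counted once."""
    cols = {x for x, _ in pixels}
    rows = {y for _, y in pixels}
    return (sum(cols) / len(cols), sum(rows) / len(rows))


# A 3 x 3 moving object around (11, 11) plus one false event at (40, 40).
blob = [(x, y) for x in range(10, 13) for y in range(10, 13)]
pixels = blob + [(40, 40)]

print(true_centroid(blob), activity_centroid(blob))      # identical for a compact blob
print(true_centroid(pixels), activity_centroid(pixels))  # outlier pulls the second harder
```

For the compact blob the two measures agree. With the outlier added, the equal-weight scheme moves further from the object than the true centroid does, which is the drawback the text describes.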

5.4.6 APS analysis 

The APS subsystem is composed of two main units: the APS pixel and 
the edge circuitry that both controls the pixels and helps to process output 
analog data. 

The APS pixel used here is a well-proven 3T design (reset, amplifier, and 
readout switch transistors), which has been well analyzed in other papers 
[16,38]. The main characteristics of this specific implementation are 
summarized below. Figure 5-18 shows a schematic of the APS pixel and 
Figure 5-19 shows the column readout circuit. 

5.4.7 APS pixel linearity, gain, and sensitivity 

The gain of the pixel from incident light to output voltage is a function of 
only a few circuit elements. The first is the integrating capacitance of 94.2 fF 
on the photodiode node. This sets the conversion gain of the photodiode at 
1.70 µV/e⁻. Following the input capacitance is the gain of the pixel gate-source 
amplifier. The output impedance of the column current sources and 
g_mb set this gain at 0.77. The switched capacitor of the CDS circuit ideally 
performs subtraction of voltages with a gain of one. Leakage currents and 
coupling in the switches will introduce error, but because this is not a gain 
error, its gain can be assumed to be one for the purposes of this analysis. 
Following the switched capacitor is a two-stage opamp connected in unity-gain 
configuration. As such, its actual gain is more like A/(1+A), where A is 
the gain of the opamp. For the design considered here, A is around 15,000, 
which makes the gain of the buffer configuration virtually unity. 

At this point, the total gain of the system is 1.31 µV/e⁻. To translate this 
into a useful figure, it is necessary to convert the units of e⁻ to units of 
(µW/cm²)·s by assuming a quantum efficiency of 20% and a wavelength of 
600 nm, and by noting that the photodiode area is 30.87 µm². The resulting 
gain, equating voltage to light intensity and integration time, is 
244 µV/((µW/cm²)·ms). 
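
The chain of gains can be reproduced with a short calculation. The sketch below multiplies the conversion gain q/C by the follower and buffer gains, then converts electrons to optical exposure using the photon energy at 600 nm. The circuit values are those quoted in the text; the physical constants are standard values and the function names are illustrative.

```python
Q_E = 1.602176634e-19     # elementary charge, C
H_PLANCK = 6.62607015e-34  # Planck constant, J*s
C_LIGHT = 2.99792458e8     # speed of light, m/s


def aps_gain_uv_per_electron(c_pd=94.2e-15, follower_gain=0.77, opamp_gain=15000.0):
    """Voltage at the column output per collected electron, in microvolts."""
    conversion = Q_E / c_pd                        # V per electron at the photodiode
    buffer_gain = opamp_gain / (1.0 + opamp_gain)  # unity-gain buffer, A / (1 + A)
    return conversion * follower_gain * buffer_gain * 1e6


def electrons_per_uw_cm2_ms(area_um2=30.87, qe=0.20, wavelength_m=600e-9):
    """Electrons collected in 1 ms at an irradiance of 1 uW/cm^2."""
    power_w = 1e-6 * area_um2 * 1e-8               # 1 um^2 equals 1e-8 cm^2
    photons_per_s = power_w / (H_PLANCK * C_LIGHT / wavelength_m)
    return photons_per_s * qe * 1e-3


gain = aps_gain_uv_per_electron()
responsivity = gain * electrons_per_uw_cm2_ms()
print(gain)          # about 1.31 uV per electron
print(responsivity)  # about 244 uV per (uW/cm^2 * ms)
```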

5.4.8 APS noise 

The noise of the APS system begins with the photodiode itself. Photon 
shot noise and dark current shot noise are described as follows: 

$$\langle v^2_{photon} \rangle = \frac{q \, I_{photo} \, \Delta t_{reset}}{C^2_{pdiode}} \qquad (5.21)$$

$$\langle v^2_{dark} \rangle = \frac{q \, I_{dark} \, \Delta t_{reset}}{C^2_{pdiode}} \qquad (5.22)$$

With I_photo + I_dark ≈ 2 pA, Δt_reset = 926 µs, and the capacitance of the 
photodiode node at 94.2 fF, the total noise from current through the 
photodiode comes to about 33.4 × 10⁻⁹ V², or 183 µV rms. 

Reset noise is calculated to be 210 µV rms from the following basic 
equation: 

$$\langle v^2_{pixreset} \rangle = \frac{kT}{C_{pdiode}} \qquad (5.23)$$

This noise figure is only appropriate for reset times that are long enough 
for the photodiode to reach a steady state during reset. The usual mode of 
operation for the centroid pixel involves a reset as long as a full-row readout 
time (926 µs), which is long enough for the pixel to reach steady state reset 
at moderately high light levels. However, for lower light levels, the 
non-steady-state noise energy relation should hold: 

$$\langle v^2_{pixreset} \rangle = \frac{kT}{2 C_{pdiode}} \qquad (5.24)$$

This corresponds to a voltage of 148 µV rms. This thermal noise figure gives 
the fundamental noise floor of the images regardless of matching. 

Noise is also associated with the output follower pixel amplifier and the 
input to the CDS circuit. During the clamping of the CDS capacitor, the 
noise can be modeled as kT/C noise with the CDS capacitor at 100 fF. This 
noise amounts to 

$$\langle v^2_{cdsclamp} \rangle = \frac{kT}{C_{CDS}} \qquad (5.25)$$

or 41.43 × 10⁻⁹ V². 

After the output side of the CDS clamp is unclamped, the total noise 
power becomes the sum of the noise contributions from the pixel follower 
amplifier and the column bias transistor. The two noise sources present are 
1/f and transistor shot noise. The noise in the currents of these transistors 
includes ⟨i²_th⟩, the contribution of the thermal noise of each transistor, and ⟨i²_f⟩, 
the contribution of the 1/f noise for each transistor, where 

(5.26) 

and 

(5.27) 



The total noise current flowing in the column line is 



$$\langle i^2_{col} \rangle = \langle i^2_{thM1} \rangle + \langle i^2_{thM2} \rangle + \langle i^2_{fM1} \rangle + \langle i^2_{fM2} \rangle \qquad (5.28)$$



and the resulting noise (in energy units of V²) to the input of the 
non-clamped CDS circuit is 




(5.29) 



The noise contribution of the follower amplifier is denoted by ⟨v²_amp⟩, 
and is calculated to be 185.54 × 10⁻⁹ V² for the buffer amplifiers of the 
centroid-tracking imager, corresponding to 431 µV rms [39]. 

In addition to all of these fundamental noise sources [40], there are also 
the unwanted variations in processing in the pixels that are collectively 
named fixed-pattern noise (FPN). The dominant phenomenon of FPN is the 
random variation in the threshold voltage of the reset and pixel amplifier 
transistors. The CDS circuit should eliminate the effects of this variation, 
and should eliminate the 1/f noise sources in the circuit as well. However,
the reset is sampled after the signal, and the two voltages are not part of the 
same integration cycle; thus the kT/C noise from the sampled reset is not 
correlated to the noise from the integration cycle. This means the kT/C noise 
is not eliminated, and must still be included in the total noise figure. Indeed, 
it must be counted twice because the reset noise power from the actual 
integration cycle and from the sampled reset will both add to the total noise. 
The noise contribution of the column with CDS cleanup of the 1/f noise has
been labeled here as $\langle v_{colcds}^2 \rangle$, where

[Equation (5.30) is missing in the source.]

Since the noise remaining after CDS contains only thermal components,
it can be reduced to a kT/C term. This leads to a final noise expression, after
CDS, of

[Equation (5.31) is missing in the source.]

Adding all of these noise contributions together gives

$$\langle v_{apstotal}^2 \rangle = 304.3 \times 10^{-9}\ \mathrm{V^2} \qquad (5.32)$$

for a final noise amplitude of 552 µV rms, neglecting 1/f noise.

5.4.9 APS dynamic range and SNR 

The output voltage range of the APS system is originally limited by the 
maximum reset voltage in the pixel, minus the lowest voltage for reliable 
photodiode operation. 

In this case, reset is approximately $V_{dd} - V_{TN}$, or 3.3 V - 1.0 V = 2.3 V for
an NMOS transistor (including the bulk effect). The reset transistor will still 
be supplying current to the photodiode during the reset cycle. Exactly how 
much current the photodiode draws will be determined by the light intensity 
falling on the pixel at the time of reset, and the final voltage of the pixel will 
reflect this. Since these factors can and do vary during the operation of the 
imager, the real reset voltage also varies. Part of the function of the CDS 
circuit is to compensate for this normal variance of the reset signal. The 
follower in the pixel causes the column voltage to drop by a further amount
equal to $V_{TN}$, which together with the bulk effect reduces the maximum
(reset) voltage on the column to 1.3 V.

The minimum voltage of the column is dictated by the column current
sources: they must stay in saturation. For this to be true, the column voltage
must satisfy $V_{ds} > V_g - V_T$, or

$$V_{ds} > \sqrt{\frac{2 I_D L}{K'_n W}} \qquad (5.33)$$



where $I_D$ is the saturated bias current for the column. For this chip and for
the bias of 260 nA used in these columns, it can be calculated that $V_{ds}$ must
be greater than 85 mV to stay in saturation. This gives a practical minimum
voltage of 100 mV. The output range is therefore about 1.2 V.

With an output range of 1.2 V and a noise level of 455 µV rms, the
signal-to-noise ratio is 68 dB.
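
The SNR figure follows directly from the ratio of output range to RMS noise. A minimal sketch of that arithmetic is given below; the helper name `snr_db` is introduced here for illustration only.

```python
import math

def snr_db(signal_range_v, noise_rms_v):
    """Voltage-ratio SNR in decibels: 20*log10(signal / noise)."""
    return 20.0 * math.log10(signal_range_v / noise_rms_v)

# 1.2 V output range and 455 uV rms noise, as stated in the text
aps_snr = snr_db(1.2, 455e-6)
print(f"{aps_snr:.1f} dB")
```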

5.4.10 APS speed 

The design goal for imaging frame rate was 30 fps. The APS subsystem 
easily meets this specification for speed. Faster frame rates are possible, but 
there is a direct trade-off between exposure time and frame rate, with faster 
rates requiring higher light levels. The absolute limit on speed is governed 
by the column current sources that bias the source followers in the pixels 
during operation. These current sources are normally biased to around 
260 nA for low-power operation. This current drive, combined with the 
column line capacitance of 200 fF, gives a maximum fall rate of 1.3 V/µs.
Therefore, the worst case settling time for one column is about 925 ns with a 
1.2 V voltage range. The settling time for each CDS amp to be switched onto 
the pixel bus is 20 ns. Thus, the columns in a row take 925 ns to settle, and 
each pixel clocked out takes 20 ns to settle. In this imager with 36 rows and 
120 columns, the maximum frame rate is estimated to be 8300 fps if the 
exposure problems associated with short integration times are ignored. 
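
The frame-rate estimate can be reproduced from the quantities given above. The sketch below assumes, as the paragraph describes, that each row costs one column settling time plus one 20 ns pixel readout per column; the function name and argument names are illustrative and not taken from the chip documentation.

```python
def max_frame_rate(rows=36, cols=120, i_bias=260e-9, c_col=200e-15,
                   v_swing=1.2, t_pixel=20e-9):
    """Estimate the readout-limited frame rate in frames per second."""
    slew = i_bias / c_col            # column fall rate, V/s
    t_col = v_swing / slew           # worst-case column settling time, s
    t_row = t_col + cols * t_pixel   # settle once, then clock out each pixel
    return 1.0 / (rows * t_row)

fps = max_frame_rate()
print(f"{fps:.0f} fps")
```

Because the sketch uses the unrounded settling time rather than 925 ns, the result lands slightly above, but close to, the 8300 fps quoted in the text.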

In practice, the frame rate is limited by the desired SNR of the
imager and the light level to be imaged. The maximum integration time per
frame is 35/36 of the frame duration. Hence, the formula for the frame rate is

$$r_{frame} < \frac{(\text{light level})(\text{light-to-voltage gain})}{(SNR)(\text{system noise})} \qquad (5.34)$$

Using values computed earlier in this chapter, this becomes

$$r_{frame} < \frac{(\text{light level})\left[244\ \dfrac{\mu\mathrm{V}}{(\mu\mathrm{W/cm^2})\cdot\mathrm{ms}}\right]}{(SNR)(455\ \mu\mathrm{V})} \qquad (5.35)$$
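
As an illustration of how this bound behaves, the sketch below evaluates it for a given light level and target SNR. It rests on two assumptions not stated in the text: the SNR is taken as a linear voltage ratio rather than a value in decibels, and the 35/36 integration-time factor is ignored. The function and constant names are hypothetical.

```python
GAIN_UV_PER_UW_CM2_MS = 244.0  # light-to-voltage gain, uV per (uW/cm^2) per ms
NOISE_UV = 455.0               # system noise, uV rms

def frame_rate_limit_fps(light_uw_cm2, snr_ratio):
    """Upper bound on frame rate (frames/s) for a light level and linear SNR."""
    per_ms = light_uw_cm2 * GAIN_UV_PER_UW_CM2_MS / (snr_ratio * NOISE_UV)
    return per_ms * 1e3  # convert frames per millisecond to frames per second

# Example: 100 uW/cm^2 illumination, SNR of 100 (40 dB)
limit = frame_rate_limit_fps(100.0, 100.0)
print(f"{limit:.0f} fps")
```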



5.4.11 Correlated double sampling analysis 

The CDS circuitry is essentially very simple: a capacitor, a switch, and a 
buffer opamp. Leakage of the floating capacitor node is the biggest potential 
problem to be faced. The severity of the effects of such leakage on the 
output signal will be estimated in this section. 

Given the known dark current of the APS pixel, it is estimated that 
leakage from the drain diffusion of the clamping switch is 20.25 aA. It is 
also known that the CDS series capacitor has a value of 100 fF. These data 
allow the calculation of the voltage decay rate due to leakage: 



$$\frac{\Delta V}{\Delta t} = \frac{20.25\ \mathrm{aA}}{100\ \mathrm{fF}} = 203\ \mu\mathrm{V/s} \qquad (5.36)$$



For a 33 ms frame, the 925 µs row readout time will cause this voltage to
decay by 188 nV. This figure is far below the noise level of the APS system 
and can safely be ignored. Only a frame rate roughly 200 times slower than 
this (about 1 frame every 6.6 s) would cause this leakage to be significant 
compared to the noise of the system. 
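
The droop estimate is a direct application of $\Delta V = (I/C)\,\Delta t$. A short sketch follows, with a hypothetical helper name `leakage_droop`:

```python
def leakage_droop(i_leak_a, cap_f, hold_time_s):
    """Voltage lost from a floating capacitor node due to constant leakage."""
    return i_leak_a / cap_f * hold_time_s

decay_rate = 20.25e-18 / 100e-15               # V/s, as in equation (5.36)
droop = leakage_droop(20.25e-18, 100e-15, 925e-6)
print(f"{decay_rate * 1e6:.1f} uV/s, {droop * 1e9:.0f} nV")
```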

5.4.12 APS power consumption 

The power consumption of the whole subsystem is the sum of the power 
requirements for the digital row circuitry, the pixel reset current, the pixel 
amplifier output current, the CDS circuitry, and finally the digital shift 
registers for outputting each pixel. Assuming a normal photocurrent of 
2 pA/pixel, which is observed under ordinary indoor lighting conditions, the 
total current of the APS subsystem is estimated to be 493 µA. The data in
Table 5-4 show that the dominant current draw is due to the biases of
the CDS buffer amplifiers.

Table 5-4. Estimated current consumption of APS circuits.

Circuit                             Current consumption (µA)
Column biases                       31.2
Photocurrent (2 pA/pixel)           8.64
Column (CDS) buffer amplifiers      417
Digital row circuitry               16.4
Digital column readout circuitry    20.0
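
Taking the entries of Table 5-4 at face value, the total and the share drawn by the CDS buffer amplifiers can be tallied as follows. The dictionary keys are simply labels chosen here.

```python
# Entries of Table 5-4, in the table's own units
aps_currents = {
    "column_biases": 31.2,
    "photocurrent": 8.64,
    "cds_buffer_amplifiers": 417.0,
    "digital_row": 16.4,
    "digital_column_readout": 20.0,
}

total_current = sum(aps_currents.values())
cds_share = aps_currents["cds_buffer_amplifiers"] / total_current
print(f"total = {total_current:.2f}, CDS buffer share = {cds_share:.1%}")
```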



5.4.13 Analysis of the centroid-tracking system 

The analysis is again divided into two parts, that for the pixel and that for 
the edge circuitry. For the centroid-tracking pixel, sensitivity to light change 
and to noise will be analyzed. In addition, the linearity of the output circuit 
computation and the general system characteristics will be examined. 

Figure 5-20 shows a schematic of the centroid-tracking pixel. Figure 5-21
shows one cell of the edge circuitry.

5.4.14 Centroid pixel sensitivity 

From section 5.4.4, the equation for the input voltage change necessary
to trigger the centroid-calculating pixel is

[Equation (5.37) is garbled in the source.]

or

[Equation (5.38) is garbled in the source; the legible fragments include $V_{invreset}$.]

In addition to the $\Delta V_{in}$ necessary to raise the output of the inverter from
its reset state to $V_{TN}$, the coupling of the reset switch and dummy
compensation switch must also be considered. This will add a $\Delta V_{switch}$
voltage to the input that will significantly affect $\Delta V_{in}$. While the reset switch
is being turned off, it still has some finite resistance. In addition, the output
of the inverter remains a low-impedance restoring voltage, which can sink
enough current to offset the charge from the gate of the switch as it turns off.
Therefore, most of the effect of the clock feed-through will occur as the
NMOS switch gate voltage goes below $V_{TN}$. In this region of weak inversion,
most of the charge has already been depleted from the channel. The
remaining clock feed-through effect resides in the gate-drain overlap
capacitance. This capacitance for a minimum-size transistor is about 0.5 fF.
Including bulk effects, the gate voltage at which the switch will turn off will
be around 1.6 V. The charge injected by a voltage swing of 1.6 V into 0.5 fF
is only about 0.8 fC. This amount of charge can be removed by the tiny
subthreshold currents running through the switch before it is completely off.
These currents can easily be on the order of 1 nA, even at very low gate
voltages. A current of 1 nA would remove 0.8 fC in 0.8 µs. As long as the
fall time of the clock signal is even slightly slow, the effect of clock
feed-through will be reduced by these subthreshold currents. The charge that does
feed through will arrive at the main 150 fF capacitor of the pixel. In addition,
the dummy switch transistor operating in full inversion will take in more
charge with its complementary clock than the main switch manages to
release. It will couple into the input node with both drain and source overlap
capacitances and with the gate-channel capacitance while above threshold.
The combined effect of both transistors, conservatively assuming that the
main switch does not conduct very well while it is in subthreshold, is as
follows:



$$\Delta V_{switch} = -V_{TN}\left(\frac{C_{gd1}}{C_{accap}}\right) + V_{dd}\left(\frac{C_{gd2} + C_{gs2}}{C_{accap}}\right) + (V_{dd} - V_{TN})\left(\frac{C_{gc2}}{C_{accap}}\right) \qquad (5.39)$$
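
The charge estimate in the paragraph preceding equation (5.39) can be checked with a few lines of arithmetic. The sketch below uses only the values quoted there (0.5 fF overlap capacitance, 1.6 V swing, 1 nA subthreshold current, 150 fF pixel capacitor); the last quantity, the voltage step that would result if none of the charge were removed, is computed here for illustration and is not a figure from the text.

```python
C_OVERLAP = 0.5e-15   # gate-drain overlap capacitance of the main switch, F
V_SWING = 1.6         # gate swing below turn-off, V
I_SUBTHRESHOLD = 1e-9 # subthreshold current through the switch, A
C_ACCAP = 150e-15     # main pixel capacitor, F

q_injected = C_OVERLAP * V_SWING                 # injected charge, C
t_removal = q_injected / I_SUBTHRESHOLD          # time to remove it, s
worst_case_step = q_injected / C_ACCAP           # step if nothing is removed, V
print(f"{q_injected * 1e15:.1f} fC, {t_removal * 1e6:.1f} us, "
      f"{worst_case_step * 1e3:.2f} mV")
```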



It should be noted that the “high” gate voltage of the dummy switch need
not be $V_{dd}$ as it is in this design. If the value of the logic high voltage sent to
the gate of the dummy transistor were changed, the expression for $\Delta V_{switch}$
would be different. The value of this variable voltage would be substituted
wherever $V_{dd}$ appears in the current formula. In this way, one could directly
control the value of $\Delta V_{switch}$ with this control voltage. This would in turn
control the sensitivity of the pixel.

The completed expression for $\Delta V_{switch}$ allows us to write the full
description for the change in photodiode voltage necessary to trip the
inverter:

$$\Delta V_{pdiode} = \Delta V_{in} + \Delta V_{switch} \qquad (5.40)$$



© M. Clapp and R. Etienne-Cummings, “Dual pixel array for imaging, motion detection and centroid 
tracking,” IEEE Sensors Journal, vol. 2, no. 6, pp. 529-548, Dec. 2002 (partial reprint). 







Normally, the biases and voltages are set such that the gain $A_{inv} = -1260$
and $\Delta V_{out} = 250$ mV. The value of $\Delta V_{in}$ thus becomes 198 µV and $\Delta V_{switch}$ is
computed to be 56.5 mV. The voltage for $\Delta V_{pdiode}$ in this case is therefore
dominated by $\Delta V_{switch}$.

The photodiode voltage is regulated by $V_{gs}$ of transistor M1. This $V_{gs}$ is
directly dependent on the photocurrent of the photodiode. If the light falling
on this pixel induces a subthreshold pixel current (as it does for almost all
lighting conditions), then the source voltage of M1 will change as the natural
logarithm of current change. The current will have to increase by a specific
multiplicative factor from its value during inverter reset to change $V_{gs}$ by a
sufficient amount. To change the source voltage by a specific $\Delta V_{pdiode}$, the
current will need to reach $I_{trip}$ as described in the following equation:

$$I_{trip} = M_{light}\, I_{reset} \qquad (5.41)$$

where

$$M_{light} = \exp\left(\frac{\Delta V_{pdiode}}{U_t}\right) \qquad (5.42)$$

From the values for $\Delta V_{pdiode}$ given above, $M_{light} = 9.6$.
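
The chain from inverter gain to light-change factor can be followed numerically. The sketch below assumes an exponential dependence on $\Delta V_{pdiode}/U_t$ and a thermal voltage of 25.1 mV; that value of $U_t$ is not given in the text and is chosen here only because it is consistent with the quoted result. The function name `trip_factor` is hypothetical.

```python
import math

def trip_factor(dv_out=0.250, gain_magnitude=1260.0,
                dv_switch=56.5e-3, u_t=0.0251):
    """Return (dV_in, dV_pdiode, M_light) for the centroid pixel.

    u_t (thermal voltage, V) is an assumed value, not taken from the text.
    """
    dv_in = dv_out / gain_magnitude       # input change needed to trip, V
    dv_pdiode = dv_in + dv_switch         # add the switch coupling term
    m_light = math.exp(dv_pdiode / u_t)   # required photocurrent ratio
    return dv_in, dv_pdiode, m_light

dv_in, dv_pdiode, m_light = trip_factor()
print(f"dV_in = {dv_in * 1e6:.0f} uV, dV_pdiode = {dv_pdiode * 1e3:.1f} mV, "
      f"M_light = {m_light:.2f}")
```

The sketch makes the dominance of the switch term visible: removing `dv_switch` reduces the required factor to barely above 1.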

5.4.15 Centroid pixel noise 

Since the output of each centroid-tracking pixel is a digital voltage, noise 
in the output voltage is not a concern. However, noise can degrade the 
operation of the pixel by introducing a random voltage component into $\Delta V_{in}$,
and hence in the factor of the increase in light level necessary to activate the 
pixel. In this section, the noise-induced uncertainty will be calculated. 

The first noise source arises from the process of resetting the inverter. 
During reset, the output node of the inverter will exhibit kT/C noise due to 
the thermal noise in the inverter transistors and in the main pixel capacitor. 
The main pixel capacitor has a capacitance of 150 fF, so this reset noise will 
be $\langle v_{reset}^2 \rangle = 26.5 \times 10^{-9}\ \mathrm{V^2}$ for an amplitude of 162 µV rms. When the reset is
released, the input will start with this level of noise. 

The photodiode/NMOS bias subcircuit is easy to analyze for noise,
because it is the same as an APS photodiode in perpetual reset. As such, it
also exhibits kT/C noise. During normal (non-reset) operation, however, the
main explicit capacitor in the pixel is floating and will not contribute to this
noise figure. The photodiode parasitic capacitance on this node totals 82 fF.
This gives $\langle v_{pdiode}^2 \rangle = 50.5 \times 10^{-9}\ \mathrm{V^2}$ and a noise amplitude of 225 µV rms.

These two noise sources together give a noise power to the input of the
inverter of

$$\langle v_{reset}^2 \rangle + \langle v_{pdiode}^2 \rangle = 77 \times 10^{-9}\ \mathrm{V^2} \qquad (5.43)$$

or an RMS noise voltage of 277.5 µV. This much voltage change on the
input of the inverter corresponds to an extra factor of 1.01 of photocurrent
and hence light intensity. Compared to the threshold for tripping the inverter
of $M_{light} = 9.58$, this is a small amount of noise, clearly not enough to cause
accidental activation of the inverter.
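
The combination of the two uncorrelated noise powers, and the light-level factor that the resulting RMS voltage represents, can be sketched as below. As before, the thermal voltage of 25.1 mV is an assumption made for illustration and does not appear in the text.

```python
import math

V2_RESET = 26.5e-9    # inverter reset noise power, V^2
V2_PDIODE = 50.5e-9   # photodiode node noise power, V^2
U_T = 0.0251          # assumed thermal voltage, V

# Uncorrelated noise sources add in power, not in amplitude
v2_total = V2_RESET + V2_PDIODE
v_rms = math.sqrt(v2_total)

# Equivalent multiplicative change in photocurrent for this voltage
noise_factor = math.exp(v_rms / U_T)
print(f"{v_rms * 1e6:.1f} uV rms, factor {noise_factor:.3f}")
```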

5.4.16 Centroid subsystem speed 

The limiting speed of centroid operation is dependent on the reset time 
and the propagation delay of the pixel inverter. There is a certain minimum 
time that the inverters of the pixels need during reset to settle to their final 
trip point. 

Each inverter during reset can be modeled as a PMOS current source, a
diode-connected NMOS transistor, and a main pixel capacitor to AC ground.
The diode-connected NMOS can be approximated as a resistor with a value
of $1/g_{m5}$, with $g_{m5}$ calculated at the operating point where $V_{out}$ has reached its
final equilibrium voltage. This will not be a true representation of the circuit
operation, since an accurate analysis would require a large-signal model.
However, the approximation of $1/g_{m5}$ at the equilibrium point will give
results that are always more conservative than the true behavior. If $V_{out}$ starts
high and must fall to reach equilibrium, the actual $g_{m5}$ will be larger than the
equilibrium $g_{m5}$ and the circuit will reach equilibrium faster. If $V_{out}$ starts
lower than equilibrium, the actual $g_{m5}$ will be lower, and again the circuit
will reach equilibrium faster in reality than the approximation would
indicate.

Figure 5-22(a) shows the simplified inverter circuit; Figure 5-22(b) is
the equivalent small-signal circuit. In the small-signal model, the inverter
becomes a simple RC circuit with $\tau = C_{accap}/g_{m5}$, where $g_{m5} = I_D(\kappa_n/U_t)$ for
the circuit in a subthreshold state. At equilibrium, $I_D = 68.82$ pA, and
therefore $g_{m5} = 2.27 \times 10^{-9}$ mho and $R = 439.8$ MΩ. With $C_{accap} = 150$ fF,
this gives a time constant of $\tau = 66.0$ µs. However, $2.2\tau$, or 145 µs, would
be a more conservative figure, with the voltage moving 90% of its
total swing in that time. The actual reset time constant will be shorter, but
145 µs will be a good conservative minimum.

Figure 5-22. (a) Simplified centroid pixel inverter in reset. (b) Schematic diagram of an
equivalent small-signal inverter in reset.

The propagation delay of the inverter is directly dependent on the bias of 
M3. The parasitic capacitance on the output of the inverter is roughly 
12.3 fF. This yields a propagation delay of 



$$t_{inverter} = \frac{C V_{TN}}{I_{bias}} = \frac{(12.3\ \mathrm{fF})(0.75\ \mathrm{V})}{I_{bias}} \qquad (5.44)$$

For $I_{bias} = 68.82$ pA, $t_{inverter} = 134$ µs. Summing this with the minimum reset
time gives a total minimum cycle time of 279 µs and a maximum centroid
rate of 3580 Hz. It should be noted that increasing the inverter bias current
by a factor of 10 will decrease the inverter propagation time by a factor of 
10, but will increase the centroid system current consumption by only about 
1.5%. 
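
The timing budget above can be reproduced from the quoted component values. This is a minimal sketch using only numbers from the text; the variable names are chosen here for readability.

```python
import math

C_ACCAP = 150e-15    # main pixel capacitor, F
G_M5 = 2.27e-9       # equilibrium transconductance of the diode-connected NMOS, S
C_OUT = 12.3e-15     # parasitic capacitance on the inverter output, F
V_TN = 0.75          # threshold voltage used in equation (5.44), V
I_BIAS = 68.82e-12   # inverter bias current, A

tau = C_ACCAP / G_M5                     # small-signal RC time constant, s
t_reset = 2.2 * tau                      # conservative reset time, s
settled_fraction = 1.0 - math.exp(-2.2)  # fraction of swing covered in 2.2*tau
t_inverter = C_OUT * V_TN / I_BIAS       # propagation delay, s
t_cycle_min = t_reset + t_inverter       # minimum cycle, s
max_rate_hz = 1.0 / t_cycle_min

print(f"tau = {tau * 1e6:.1f} us, t_reset = {t_reset * 1e6:.0f} us, "
      f"t_inverter = {t_inverter * 1e6:.0f} us, rate = {max_rate_hz:.0f} Hz")
```

Since `t_inverter` scales inversely with `I_BIAS`, raising the bias tenfold in this sketch shortens the propagation delay tenfold, as the text notes.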

The time after reset and before the tripping of the inverter is spent waiting
for the photocurrent to change by a sufficient magnitude. The length of this
period of time, $t_{detect}$, determines the minimum rate of change of light levels
in order to be detected:

$$R_{change} = \frac{M_{light}}{t_{detect}} \quad \Longrightarrow \quad t_{detect} = \frac{M_{light}}{R_{change}} \qquad (5.45)$$

Table 5-5. Estimated current consumption of centroid-calculating circuits.

Circuit                                         Current consumption
Photocurrent (6 pA/pixel)                       12.96 nA
Digital circuitry (180 Hz)                      305 pA
Pixel inverters (68.8 pA/pixel)                 148.7 nA
Resistive ladder diffamps (1.2 µA/amplifier)    115.2 µA
Resistive ladder                                4 pA

For a desired maximum $R_{change}$, equation (5.45) computes the minimum
$t_{detect}$ part of the centroid cycle. The final cycle time takes

$$t_{cycle} = t_{reset} + t_{detect} + t_{inverter} \qquad (5.46)$$

Conversely, the longest one can wait without the inverter falsely tripping
determines the maximum period possible with this centroid circuit. Leakage
from the switch drain and source diffusions limits the amount of time the
input of the inverter can remain floating. The maximum time before the
leakage current alone causes the inverter to trip can be predicted given that
(i) $\Delta V_{in} = 198$ µV (from section 5.4.14), (ii) the leakage current from the
three drain/source diffusions is 60.75 aA, and (iii) the capacitance of the
input to the inverter is 150 fF. In the absence of light, this should happen at
$\Delta t_{leak} = 489$ ms.
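
The leakage-limited period follows from $\Delta t = C\,\Delta V / I$. A brief sketch with the three values listed above, using a hypothetical helper name:

```python
def time_to_trip(dv_in_v, cap_f, i_leak_a):
    """Time for a constant leakage current to move a floating node by dv_in_v."""
    return dv_in_v * cap_f / i_leak_a

# 198 uV trip margin, 150 fF input capacitance, 60.75 aA total leakage
t_leak = time_to_trip(198e-6, 150e-15, 60.75e-18)
print(f"{t_leak * 1e3:.0f} ms")
```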



5.4.17 Centroid power consumption 

The power consumption of the centroid-tracking circuitry depends on the 
photocurrent drawn by the continuously biased photodiode, the operation of 
the pixel inverter, and the digital and analog circuitry on the periphery of the 
chip. 

Photocurrent can easily vary by decades depending on the intensity of the 
incident light. A photocurrent of 6 pA may be assumed here, since it has 
actually been observed with this chip under indoor lighting conditions. 
Given this level of incident light, the continuously biased pixels use about 
13 nA over the whole array. The total current of this block has been
estimated to be 116 µA, of which the largest share goes to the buffer
diffamps on the resistive ladder of the edge circuitry. Table 5-5 shows the
computed current draw of each portion of the centroid circuitry.

Figure 5-23. Output pixel voltage as a function of input light level for a 33 ms integration time.

Figure 5-24. Temporal noise at the output of the chip.



5.4.18 Measured APS linearity, gain, and sensitivity 

A graph of pixel voltage output as a function of light intensity input is 
shown in Figure 5-23. The points corresponding to the input light intensities 
between 11 µW/cm² and 121 µW/cm² were fitted to the straight line shown
in the figure. Within this range, the slope was 8.27 mV/(µW/cm²). For an
integration time of 33 ms, this corresponds to 250 µV/((µW/cm²)·ms), which
differs by only 3.3% from the estimate of 244 µV/((µW/cm²)·ms) in
section 5.4.7.

The voltage range corresponding to this fit was 2.19 V to 3.09 V, for a 
linear range of 0.9 V. Within this range, the RMS error of the voltage values 
from the fitted line was 2.3 mV. 

5.4.19 Measured APS noise 

For the APS imager, all measured noise exceeded the predictions from 
the analysis. This is understandable considering that it was necessary to omit 
1/f noise from the computed noise value.

Temporal noise in the APS imager was measured over time at the same 
pixel location, and was extremely low compared to other noise sources. All 
measurements indicated a temporal noise of less than 1.8 mV rms over the
linear signal range (see Figure 5-24). The APS system noise predicted in
section 5.4.8 was 552 µV rms, which corresponds to the minimum value in
Figure 5-24. This measured noise was so low that it likely approached the
minimum levels that our board readout electronics were capable of detecting
accurately.

Pixel-to-pixel fixed-pattern noise for the imager was barely detectable at
low light levels, but increased at an almost linear rate as the signal level
increased. The large-signal equation for a gate-source follower with
transistor M1 as the follower and M2 as the current source is

$$V_{in} = V_{out} + V_{T0} + \gamma_1\left(\sqrt{2|\phi_F| + V_{out}} - \sqrt{2|\phi_F|}\right) + \sqrt{\frac{2 I_D}{K'_1 (W/L)_1} \cdot \frac{1 + \lambda_2 V_{out}}{1 + \lambda_1 (V_{dd} - V_{out})}} \qquad (5.47)$$

Given equation (5.47), an expression for the change in output voltage due
to a change in input voltage (as would be computed by the CDS unit) can be
derived:

$$V_{in}(\text{reset}) - V_{in}(\text{final}) = V_{out}(\text{reset}) - V_{out}(\text{final}) + \gamma_1 B + \sqrt{\frac{2 I_D}{K'_1 (W/L)_1}}\, D$$

$$B = \sqrt{2|\phi_F| + V_{out}(\text{reset})} - \sqrt{2|\phi_F| + V_{out}(\text{final})}$$

$$D = \sqrt{\frac{1 + \lambda_2 V_{out}(\text{reset})}{1 + \lambda_1 (V_{dd} - V_{out}(\text{reset}))}} - \sqrt{\frac{1 + \lambda_2 V_{out}(\text{final})}{1 + \lambda_1 (V_{dd} - V_{out}(\text{final}))}} \qquad (5.48)$$



From equation (5.48), it can be seen that the deviation from a direct
$\Delta V_{in} = \Delta V_{out}$ relationship involves the bulk effect of the pixel amplifier
transistor ($B$) and the drain conductance of both pixel and column current
source transistors ($D$). It can be shown that the factor $D$ increases almost
linearly with decreasing $V_{out}$ given that $\lambda_1 = 0.0801$ and $\lambda_2 = 0.0626$, which
is the case for this APS system. As $D$ increases, it magnifies the contribution
of the $(W/L)_1$ factor from the gate-source follower in the expression. Thus,
any variation in the gate-source follower transistor geometry in the pixel
will have an increasing effect as the signal increases and $V_{out}$ decreases. To
decrease this effect, $\lambda_1$ and $\lambda_2$ need to be reduced. Lengthening the column
current source transistor (and also the pixel transistors if space allows) will
accomplish this.
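
The near-linear growth of $D$ can be examined numerically. The sketch below assumes that $D$ is the difference of two square-root ratio terms, one evaluated at the reset level and one at the final level, with the channel-length modulation parameters 0.0801 and 0.0626 quoted above. The supply voltage of 3.3 V and the reset column level of 1.3 V are taken from section 5.4.9. The function names and the sampled output levels are illustrative.

```python
import math

LAMBDA_1 = 0.0801   # channel-length modulation, pixel follower
LAMBDA_2 = 0.0626   # channel-length modulation, column current source
V_DD = 3.3          # supply voltage, V

def ratio_term(v_out):
    """Square-root ratio of the two channel-length modulation factors."""
    return math.sqrt((1 + LAMBDA_2 * v_out) /
                     (1 + LAMBDA_1 * (V_DD - v_out)))

def d_factor(v_reset, v_final):
    """Assumed form of D: reset term minus final term."""
    return ratio_term(v_reset) - ratio_term(v_final)

v_reset = 1.3
levels = [1.3, 1.0, 0.7, 0.4, 0.1]   # equally spaced final output levels, V
d_values = [d_factor(v_reset, v) for v in levels]
steps = [b - a for a, b in zip(d_values, d_values[1:])]
print([round(d, 4) for d in d_values])
```

Under these assumptions the increments in `steps` differ from one another by only a few percent, which is what "almost linearly" amounts to over this output range.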

Column-to-column FPN results were relatively constant over the linear
range of the imager. These stayed at or near the worst pixel-to-pixel noise of
about 16 mV rms. There are two chief reasons for this undesirable
performance. First of all, this simple architecture used a CDS circuit on the
end of every column, so there was no way to correct column-to-column
offsets. A global CDS circuit would alleviate much of the column-to-column
FPN.

Figure 5-25. Voltage as a function of light intensity for odd and even columns.

The second reason has to do with the layout of the pixel array. Because 
two APS pixels are tiled above and below each centroid-mapping pixel in 
the array, the physical and electrical environments of adjacent APS pixels 
are not the same. Put more simply, for every two adjacent APS pixels, one 
will see the left sides of the neighboring centroid pixels and the other will 
see the right sides of the neighboring centroid pixels. Doping profiles for left 
and right APS pixels will be slightly different because of the asymmetric 
placement of the photodiode area within the centroid pixel. This differing 
proximity of the photodiode will also cause the amount of photogenerated 
carriers in the substrate to be different for left and right APS pixels under the 
same incident light. These are unfortunately types of gain errors, and as such 
cannot be remedied by the CDS circuit. As a result, alternating lighter and 
darker vertical stripes are apparent in images of scenes that are dimly and 
evenly lit. This phenomenon is also apparent when measurements are taken 
separately for odd and even column groups. Figure 5-25 clearly shows a 
different light-voltage transfer function for odd columns than for even 
columns. When the average pixel value of every column in the array is 
measured for an evenly lit scene, the distribution is not normal. The entire
array has instead a bimodal distribution, with two separate and distinct mean
values, especially at higher light levels. The column-to-column FPN of all
odd columns and of all even columns taken separately are each better than
the combined FPN figure (see Figure 5-26).

Figure 5-26. (a) Column-column fixed-pattern noise for all columns together (entire image).
(b) Column-column fixed-pattern noise for odd and even columns taken separately.

In future chips, barrier wells could be used to isolate APS pixels from 
centroid pixels and improve performance, but this would be at the expense 
of the area and fill factor of the array. Other solutions would be to make the 
array perfectly regular (with one APS pixel for every centroid pixel) or to 
make the centroid pixels perfectly symmetric. 

5.4.20 Measured APS dynamic range and SNR 

The total noise of the APS imager for different light levels can be seen in 
Figure 5-27. Sample images are shown in Figure 5-27(c). At maximum
signal level, the total noise (standard deviation/mean signal level) of 2.88% 
corresponds to an SNR of 30.8 dB. 
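
The conversion from a noise percentage to an SNR in decibels is shown below as a short sketch; the helper name is introduced here for illustration.

```python
import math

def snr_db_from_noise_percent(noise_percent):
    """SNR in dB when noise is given as a percentage of the mean signal."""
    return 20.0 * math.log10(100.0 / noise_percent)

measured_snr = snr_db_from_noise_percent(2.88)
print(f"{measured_snr:.1f} dB")
```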

5.4.21 Measured APS speed 

Imaging at a high enough frame rate to test the limits of the APS is 
difficult, due to the high light levels necessary for a useful signal. The 
readout circuitry could be tested for speed, however. These circuits still 
functioned properly up to a pixel clock speed of 11 MHz, or a period of
91 ns. This was also the limit of the test circuitry. 

5.4.22 Centroid frequency response 

In section 5.4.16, it was estimated that the fastest rate of blinking at
which a stationary blinking light would remain detectable was 3.6 kHz. To
test this calculation, a blinking LED was fed with a square-wave signal of
sufficient amplitude and variable frequency. The frequency at which the
blinking actually ceased to be detectable was around 4.2 kHz.

Figure 5-27. (a) Total FPN percentage (standard deviation/mean signal level) for different
light levels. (b) Total FPN (RMS voltage) for different light levels. (c) Sample pictures from
the APS image array.

It was also calculated in the same section that the slowest possible reset 
time would be 489 ms. To confirm this, the array was covered (to protect it 
from light) and left until the pixels tripped through leakage. The actual 
measured time varied between 450 ms and 490 ms in the dark, and was 
about 200 ms in ambient light with no motion. This strongly suggests that 
the drain-source diffusions of the inverter switches are leaking either due to 
indirect light falling on the pixels or due to the effects of minority carriers in 
the substrate from the nearby photodiodes. Such effects could be reduced 
with more careful layout of the switch transistors.

Figure 5-28. Position plot of the output data from the centroid chip superimposed on the sum of a series of reverse-video APS images.

Figure 5-29. Reverse-video image of one APS frame with 6 corresponding centroid positions plus 6 positions from the previous APS frame.

Figure 5-30. Sum of reverse-video APS images of a target moving in a figure 8 and of a stationary LED, both with all corresponding centroid positions.


5.4.23 Centroid system performance 

The centroid tracking system was tested using an analog oscilloscope 
screen as a target. The X/Y mode setting was used, and two function 
generators set to 10 Hz and 20 Hz supplied the scope channels. In this way, a 
moving point of light tracing a stable figure-8 pattern could be observed on 
the oscilloscope screen. APS image data and centroid coordinate data were 
taken simultaneously. Centroid voltages were converted to digital data and 
sent to a controlling computer. A composite image of all APS frames was 
produced by summing all frames and then inverting the brightness of the 
image for easier printing. On top of this composite image was plotted the 
centroid positions reported by the centroid-tracking subsystem of the chip. 
The result is displayed in Figure 5-28. The data are an excellent match to the
target, which was composed of two sine waves in the x and y directions. Six
centroid coordinates were taken for every APS frame. One such APS
image and the centroid coordinates of the current and previous frames are 
displayed in Figure 5-29. It is obvious that whereas the APS imager sees one 
smear of the path of the oscilloscope point, the centroid-tracking circuitry is 
able to accurately and precisely plot specific points along the path in real 
time. 
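
The test pattern described above is a 1:2 Lissajous figure. The sketch below generates the ideal target positions at the centroid sampling instants under several assumptions that are not stated in this paragraph: the 10 Hz signal drives the x axis and the 20 Hz signal the y axis, both are unit-amplitude sines with zero phase, the APS runs at the 30 fps design goal, and the centroid output is sampled six times per frame (180 Hz).

```python
import math

APS_FRAME_RATE = 30.0            # assumed APS frame rate, frames/s
CENTROIDS_PER_FRAME = 6          # centroid samples per APS frame (from the text)
CENTROID_RATE = APS_FRAME_RATE * CENTROIDS_PER_FRAME  # samples/s

def figure8_samples(count, fx=10.0, fy=20.0, rate=CENTROID_RATE):
    """Ideal (x, y) target positions at successive centroid sample instants."""
    return [(math.sin(2 * math.pi * fx * k / rate),
             math.sin(2 * math.pi * fy * k / rate))
            for k in range(count)]

samples = figure8_samples(36)
samples_per_loop = int(CENTROID_RATE / 10.0)  # one full figure 8 every 0.1 s
print(samples_per_loop, samples[:3])
```

With these assumptions each pass around the figure 8 is covered by 18 distinct centroid readings, whereas a single 33 ms APS frame integrates roughly a third of the loop into one smear.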

Figure 5-30 shows another example of the cumulative centroid positions 
reported for an oscilloscope target. A non-blinking stationary LED was 
placed next to the moving oscilloscope target to demonstrate that the 
stationary LED had no effect on centroid positions despite it being much 
brighter than the oscilloscope.

Figure 5-31. Two-dimensional histograms of centroid response with a target of three LEDs.
Circles indicate blinking LED positions and squares indicate steadily on LED positions:
(a) 3 blinking; (b) 2 blinking, 1 steadily on; and (c) 1 blinking, 2 steadily on.

With faster moving targets, the speed of the centroid subsystem could be 
increased even more. Centroid pixels are sensitive to changes in incident 
light since their last reset. Therefore, faster changes in light (faster 
movement) would allow for shorter reset intervals and higher measurement 
frequency. 

In addition to trials involving a single moving target, experiments with 
multiple targets were performed. In the first set of experiments, a target of 
three LEDs in a triangle formation was imaged. All the LEDs were either 
blinking or steadily on, and all were stationary. Three different tests were 
performed. The first test had all three LEDs blinking at exactly the same 
time. Figure 5-31(a) shows a histogram of the centroid positions reported by
the chip, with blinking LED positions marked by circles. From this 
histogram, we can see that the vast majority of positions reported are in the 
center of the triangle. Notice that since two LED positions are on nearly the 
same row, their contributions to the row position of the centroid are 
overlapping. Since the method for centroid determination ignores the 
number of active pixels in a row, the computed centroid is closer to the far 
point of the LED triangle than would be expected from a true centroid. The 
weight of the far point (row 23) in the centroid computation is comparable to 
both LEDs together on row 10 of the graph. The second experiment was the 
same as the first, except that one LED was continuously on instead of 
blinking. In Figure 5-31(b), the non-blinking LED location is marked with a 
square outline instead of a circle. The positions plotted lie between the two 
blinking LED positions and are unaffected by the steadily on LED. 
Similarly, Figure 5-31(c) shows a test with one blinking LED position
(marked with a circular outline) and two non-blinking steadily on LEDs
(marked with square outlines). In this case, there is no doubt that the only
positions reported are at the only element of the scene that is changing in
time.

Figure 5-32. Two-dimensional histogram overlaid on the imager array, showing reported
centroid positions for two LEDs with separate blinking phases and periods.
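
The bias toward the far LED in the three-LED test can be illustrated with a small model. The sketch below assumes that the edge circuitry counts each active row and each active column once, regardless of how many pixels in it are active, and then takes the unweighted mean. That is a simplified reading of the description above, not a model of the actual resistive-ladder circuit. The column coordinates are hypothetical; the rows 10 and 23 follow the text.

```python
def edge_centroid(active_pixels):
    """Centroid that counts each active row and column once (assumed model)."""
    rows = {r for r, _ in active_pixels}
    cols = {c for _, c in active_pixels}
    return sum(rows) / len(rows), sum(cols) / len(cols)

def true_centroid(active_pixels):
    """Conventional centroid, weighting every active pixel equally."""
    n = len(active_pixels)
    return (sum(r for r, _ in active_pixels) / n,
            sum(c for _, c in active_pixels) / n)

# Two LEDs on row 10 and one on row 23; column values are made up
leds = [(10, 20), (10, 40), (23, 30)]
print(edge_centroid(leds), true_centroid(leds))
```

In this model the single LED on row 23 carries as much weight in the row coordinate as the two LEDs on row 10 combined, so the reported row lies midway between the two rows rather than a third of the way.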

Another experiment was performed that involved multiple LEDs with 
uncorrelated blinking. Two LEDs with separate blinking periods and phases 
were set up at different x and y positions in front of the imager, and centroid 
positions were recorded. Figure 5-32 shows a histogram of the number of 
values recorded in specific regions of the array. In addition to the two 
positions of the actual LEDs showing a marked response, the linear 
combination of their positions also shows a considerable number of recorded 
coordinates. If two LEDs are seen to blink within the same detection period of time, 
the centroid of their positions will be computed and reported. This is the 
normal operation of the centroid subsystem. Multiple target tracking is still 
possible, however, with the addition of some basic statistical analysis of the 
positions reported. Through techniques such as Singular Value 
Decomposition (SVD), the linearly independent positions can be extracted 
and the linear combination of the two positions can be recognized as a false 
position. These techniques have more limitations in their applicability. For 
instance, if the true movement of an object happened to coincide with the 
linear combination of the movement of two other objects, it might be falsely 
omitted. For simple observations of a few objects, however, it is possible to 
extract meaningful position data for all objects involved. A system with 
broader applicability could be constructed by changing the edge circuitry of 
the centroid subsystem, allowing the detection of multiple regions of activity 
in the array. This hardware modification will be pursued in the next- 
generation imager. 

5.5 Conclusions 

This chapter has presented current-mode, voltage-mode, and mixed-mode 
focal-plane image processing approaches. The current-mode approach readily 
allows the outputs of pixels to be scaled and summed, and thus very compact 
and fast spatial convolutions can be realized on the focal plane. Linearity 
and fixed-pattern noise are, however, usually worse in current-mode imagers 
than in voltage-mode imagers. For temporal filtering and processing, 
voltage-mode imaging is better because correlated double sampling 
immediately provides a vital component of most temporal processors: the 
temporal differencer. Hence, a pipeline method was developed for 
implementing an image temporal differencing scheme that equalizes the 
delay between the two images for all pixels. This approach also allowed fine 
control on the magnitude of the delay, and can be used to increase the frame 
access rate. Both of these systems used a computation-on-readout (COR) 
architecture, which involves a block-parallel, sequential processing of the 
pixels. Lastly, a mixed-mode imager was described that used both voltage- 
mode imaging and processing (for low-noise, high-pixel-density imaging) 
and current-mode imaging (for motion detection and centroid localization). 
This mixed-mode imager also combined pixel-serial and pixel-parallel 
processing. The impact of this combination is an increase in FPN, because 
the environment seen by some pixels is different from that seen by others. 
The phototransduction gain is also negatively affected in the mixed-mode 
imager. Nonetheless, all three modes can be effectively used for low-power 
and small-footprint focal-plane image processing architectures. 

Abstract: Stochastic adaptive algorithms are investigated for on-line correction of spatial 

non-uniformity in random-access addressable imaging systems. The adaptive 
architecture is implemented in analog VLSI, and is integrated with the photo 
sensors on the focal plane. Random sequences of address locations selected 
with controlled statistics are used to adaptively equalize the intensity 
distribution at variable spatial scales. Through a logarithmic transformation of 
system variables, adaptive gain correction is achieved through offset 
correction in the logarithmic domain. This idea is particularly attractive for 
compact implementation using translinear floating-gate MOS circuits. 
Furthermore, the same architecture and random addressing provide for 
oversampled binary encoding of the image resulting in an equalized intensity 
histogram. The techniques apply to a variety of solid-state imagers, such as 
artificial retinas, active pixel sensors and IR sensor arrays. Experimental 
results confirm gain correction and histogram equalization in a 64 x 64 pixel 
adaptive array integrated on a 2.2 mm x 2.25 mm chip in 1.5 µm CMOS 
technology. 

Key words: On-line correction, non-uniformity correction, adaptation, equalization, 

floating-gate, focal plane, CMOS imager, analog VLSI. 

6.1 Introduction 



Since the seminal work by Carver Mead on neuromorphic floating-gate 
adaptation in the silicon retina [1], few groups have addressed the problem 
of on-line adaptive correction of non-uniformities on the focal plane in solid- 
state image sensor arrays [2, 3] and neuromorphic vision sensors [4, 5]. Most 
efforts have concentrated instead on non-adaptive correction using on-chip 
[6] or off-chip calibrated storage. Gain and offset non-uniformities in the 
photo sensors and active elements on the focal plane contribute “salt-and- 
pepper” fixed-pattern noise at the received image, which limits the 
resolution and sensitivity of imaging systems and image postprocessing. 
Flicker noise and other physical sources of fluctuation and mismatch make it 
necessary to correct for all these non-uniformities on-line, which is 
problematic since the image received is itself unknown. Existing “blind” 
adaptive algorithms for on-line correction are complex and the amount of 
computation required to implement them is generally excessive. Integration 
on the focal plane would incur a significant increase in active pixel size, a 
decrease in spatial resolution, a decrease in fill factor of the imager, and an 
increase in power consumption. 

In this work, a class of stochastic adaptive algorithms has been 
developed that integrate general non-uniformity correction with minimal, if 
not zero, overhead in the number of active components on the focal plane. In 
particular, floating-gate adaptive CMOS technology is used to implement a 
two-transistor adaptive-gain element for on-line focal-plane compensation of 
current gain mismatch. The algorithms make effective use of the statistics of 
pixel intensity under randomly selected sequences of address locations, and 
avoid the need for extra circuitry to explicitly compute spatial averages and 
locally difference the result. The resulting stochastic algorithms are 
particularly simple to implement. 

The stochastic algorithms for adaptive non-uniformity correction that 
take advantage of the spatial statistics of image intensity can also be used to 
perform image intensity equalization and normalization on the focal plane. 
Equalization is a useful property because it maximizes the available dynamic 
range and assigns higher sensitivity to more statistically frequent intensities. 
At the same time, the image is converted into digital form, thus avoiding the 
need for explicit analog-to-digital conversion. 

In this chapter, the stochastic algorithms for adaptive non-uniformity 
correction are formulated. A simple logarithmic transform on offset 
correction allows the use of the same algorithms for gain correction. 
Intensity equalization is discussed as a natural extension of these stochastic 
rules. The floating-gate translinear current-mode VLSI implementation of 
the adaptive pixel is described and analyzed. The system architecture is also 
described, including the external circuits used for experimental validation of 
the VLSI imager. Experimental results for gain non-uniformity correction 
and image intensity equalization are discussed. 

Non-uniformity correction can be approached using two strategies: apply 
a uniform reference image to the static imager and ensure that all pixel 
outputs are equal [1], or drift natural scenes across the imager where each 
pixel subtracts its output from its spatially low-pass filtered output to derive 
an error signal [7]. The former is referred to as static non-uniformity 
correction (SNUC) and the latter as scene-based non-uniformity correction 
(SBNUC). Our imager can accommodate either type of mismatch correction 
strategy. The SBNUC algorithm has been implemented on the focal plane in 
CMOS- and IR-based imagers [2], and has been successful in reducing offset 
mismatch. 

In this chapter, SNUC will be used as the primary method to reduce 
current gain mismatch in a phototransistor-based CMOS imager or silicon 
retina. An adjustable, adaptive pixel current gain can be achieved by 
applying a controllable voltage offset on a floating-gate transistor in each 
pixel. The system architecture also allows SBNUC through control of the 
statistics of random address sequences. 

First, the problem must be set up in terms of established on-line 
algorithms for offset correction. Then this same algorithm can be extended 
to gain mismatch reduction through a simple logarithmic transformation of 
system state variables. 

Figure 6-1 (a) schematically demonstrates the offset correction technique. 
The set of system equations is 

$$y = x + o, \qquad z = y + q = x + o + q, \tag{6.1}$$

where $x$ is the input random (sensor) variable, $y$ is the received input with 
unknown offset $o$, $q$ is the applied offset correction, and $z$ is the corrected 
output. For offset cancellation, 

$$o + q = \text{constant} \quad \forall \text{ pixels}. \tag{6.2}$$

A simple (gradient descent) adaptive rule [7] to achieve this is 

$$\Delta q = -\alpha \left[ z - z_{\mathrm{ref}} \right], \tag{6.3}$$

which adjusts the output $z$ on average towards a reference $z_{\mathrm{ref}}$. 

Figure 6-1. (a) Offset correction, (b) Gain correction. © 2001 IEEE. 

The reference is constructed by expressing 

$$z_{\mathrm{ref}} = \begin{cases} \langle z \rangle & \text{for SNUC} \\ \langle z \rangle_{\mathrm{local}} & \text{for SBNUC} \end{cases} \tag{6.4}$$

where the $\langle\,\cdot\,\rangle$ symbol represents spatial averaging at global and local scales, 
respectively, and $\alpha$ denotes the adaptation (or “learning”) rate. Circuits 
implementing a locally differenced diffusive kernel (with adjustable space 
constant) to perform the computations in equation (6.3) are presented in [2]. 
One can introduce a stochastic version of this rule 



$$\Delta q_{r(k)} = -\alpha \left( z_{r(k)} - z_{r(k-1)} \right) \tag{6.5}$$

where the subscripts $r(k-1)$ and $r(k)$ denote pixel addresses at consecutive 
time steps $(k-1)$ and $k$, respectively. Taking expectations on both sides of 
equation (6.5) (for a particular pixel selected at time $k$) yields 

$$E\left[ \Delta q_{r(k)} \right] = -\alpha \left( z_{r(k)} - E\left[ z_{r(k-1)} \right] \right)$$

which depends on the statistics of the consecutive address selections as 
determined by the conditional transition probabilities (densities) given by 
$p(r(k-1) \mid r(k))$. Therefore, by controlling the statistics through proper 
choice of the random sequence of addresses $r(k)$ [i.e., $p(r(k-1) \mid r(k))$], one 
can implement, on average, the spatial convolution kernels needed for both 
SNUC and SBNUC in equation (6.4). In particular, for a random sequence 
with the terms $r(k-1)$ and $r(k)$ independent [i.e., $p(r(k-1) \mid r(k)) = p(r(k-1))$], 

$$E\left[ z_{r(k-1)} \right] = \langle z \rangle \tag{6.6}$$

whereas if $r(k-1)$ and $r(k)$ are related by embedding memory in the address 
sequence (for example, through inertia or by imposing limits on $\Delta r = r(k) - r(k-1)$), 

$$E\left[ z_{r(k-1)} \right] = \langle z \rangle_{\mathrm{local}} \tag{6.7}$$
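
The dependence of the expected reference on the address statistics can be checked numerically. The following is a minimal sketch under assumptions that are not part of the original work: a hypothetical one-dimensional ramp "image", an address space that wraps around at its ends, and an illustrative helper named mean_previous_value. With independent addresses, the average value found at the previous address approaches the global mean; with bounded address jumps, it approaches a local mean around the selected pixel.

```python
import random


def mean_previous_value(image, target, max_jump=None, steps=200_000, seed=0):
    """Average of z at address r(k-1), taken over the steps where r(k) == target.

    max_jump=None  -> r(k) drawn independently of r(k-1) (global statistics).
    max_jump=D     -> r(k) = r(k-1) + dr with |dr| <= D, wrapping at the ends
                      (an assumed way of embedding memory in the sequence).
    """
    rng = random.Random(seed)
    n = len(image)
    prev = rng.randrange(n)
    total, count = 0.0, 0
    for _ in range(steps):
        if max_jump is None:
            cur = rng.randrange(n)
        else:
            cur = (prev + rng.randint(-max_jump, max_jump)) % n
        if cur == target:
            total += image[prev]
            count += 1
        prev = cur
    if count == 0:
        raise ValueError("target address was never selected")
    return total / count


if __name__ == "__main__":
    ramp = [float(i) for i in range(100)]      # hypothetical test image
    print("independent addresses:", mean_previous_value(ramp, target=20))
    print("bounded jumps (D = 3):", mean_previous_value(ramp, target=20, max_jump=3))
```

In this toy setting the first estimate should sit near the global mean of the ramp, while the second stays within the assumed jump limit of the selected pixel's own value.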

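Equation (6.5), taken with the independent address statistics of equation (6.6), 
is a stochastic on-line version of SNUC, and likewise equation (6.5) with the 
local address statistics of equation (6.7) implements stochastic SBNUC. 
Hardware requirements can be further simplified by thresholding the update 
in equation (6.5) to a signed version of that rule (a “pilot” rule), 

$$\Delta q_{r(k)} = -\alpha \, \mathrm{sign}\left( z_{r(k)} - z_{r(k-1)} \right) \tag{6.8}$$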

with fixed-size update increments and decrements. 
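
As an illustration of the pilot rule of equation (6.8), the sketch below simulates offset correction under uniform input with independently drawn addresses. It is a toy model under stated assumptions, not the authors' implementation: the function name, the Gaussian offsets, the step size and the number of steps are all hypothetical choices made for the example.

```python
import random
import statistics


def stochastic_offset_correction(offsets, x=1.0, alpha=0.01, steps=200_000, seed=0):
    """Apply the signed rule of equation (6.8) under a uniform input x.

    offsets : per-pixel unknown offsets o (assumed values for the example)
    Returns the learned per-pixel corrections q.
    """
    rng = random.Random(seed)
    n = len(offsets)
    q = [0.0] * n
    prev = rng.randrange(n)                      # address r(k-1)
    for _ in range(steps):
        cur = rng.randrange(n)                   # address r(k), independent draw
        z_cur = x + offsets[cur] + q[cur]        # z = x + o + q, equation (6.1)
        z_prev = x + offsets[prev] + q[prev]
        if z_cur > z_prev:
            q[cur] -= alpha                      # fixed-size decrement
        elif z_cur < z_prev:
            q[cur] += alpha                      # fixed-size increment
        prev = cur
    return q


if __name__ == "__main__":
    rng = random.Random(1)
    o = [rng.gauss(0.0, 0.5) for _ in range(64)]
    q = stochastic_offset_correction(o)
    print("spread of o     :", statistics.pstdev(o))
    print("spread of o + q :", statistics.pstdev([a + b for a, b in zip(o, q)]))
```

Only the selected pixel is updated at each step, so no spatial average is ever computed explicitly; the comparison against a randomly chosen previous pixel plays that role on average.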

6.3 Canceling gain non-uniformity 

The gradient descent formulation [7] also adaptively compensates for 
gain mismatch, although it does not prevent the gain from becoming 
negative. The approach in this chapter is to relate gain correction (under the 
positivity constraint imposed by current-domain circuits) to offset correction 
through a logarithmic transformation. This transformation has a physical 
meaning that can be exploited in the hardware implementation as discussed 
in the next section. Figure 6-1(b) schematically illustrates the concept of 
gain mismatch correction in relation to Figure 6-1(a). 

The system is governed by the equations 

$$y' = a x', \qquad z' = A y' = A a x', \tag{6.9}$$

which can be transformed into 

$$\ln z' = \ln A + \ln a + \ln x', \tag{6.10}$$

such that for gain non-uniformity correction, 

$$\ln A + \ln a = \text{constant} \quad \forall \text{ pixels}. \tag{6.11}$$

By identifying corresponding terms (in particular, $\ln A = q$ (or $A = e^{q}$) 
and $\ln a = o$) in equations (6.1) and (6.10), and because of the monotonicity 
of the logarithmic map, the learning rule of equation (6.8) can be rewritten 
as 

$$\Delta q_{r(k)} = -\alpha \, \mathrm{sign}\left( z'_{r(k)} - z'_{r(k-1)} \right), \tag{6.12}$$

which in turn can be expressed as a stochastic on-line learning rule with 
relative gain increments: 

$$\Delta A_{r(k)} = -\alpha A_{r(k)} \, \mathrm{sign}\left( z'_{r(k)} - z'_{r(k-1)} \right). \tag{6.13}$$
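
A small simulation can illustrate how the relative-increment rule of equation (6.13) equalizes gains while keeping every correction factor positive. This is a sketch under assumptions: the log-normal gain mismatch, the step size and the function name are hypothetical, and uniform illumination with independent random addressing is assumed.

```python
import math
import random
import statistics


def stochastic_gain_correction(gains, x=1.0, alpha=0.005, steps=200_000, seed=0):
    """Apply the relative-increment rule of equation (6.13) under uniform input x.

    gains : per-pixel unknown gains a (assumed values for the example)
    Returns the learned per-pixel correction factors A.
    """
    rng = random.Random(seed)
    n = len(gains)
    A = [1.0] * n
    prev = rng.randrange(n)
    for _ in range(steps):
        cur = rng.randrange(n)
        z_cur = A[cur] * gains[cur] * x          # z' = A a x', equation (6.9)
        z_prev = A[prev] * gains[prev] * x
        s = (z_cur > z_prev) - (z_cur < z_prev)  # sign of the difference
        A[cur] -= alpha * A[cur] * s             # A is scaled by (1 -/+ alpha)
        prev = cur
    return A


def coefficient_of_variation(values):
    return statistics.pstdev(values) / statistics.mean(values)


if __name__ == "__main__":
    rng = random.Random(1)
    a = [math.exp(rng.gauss(0.0, 0.3)) for _ in range(64)]
    A = stochastic_gain_correction(a)
    print("mismatch before:", coefficient_of_variation(a))
    print("mismatch after :", coefficient_of_variation([Ai * ai for Ai, ai in zip(A, a)]))
```

Because each update multiplies A by a factor slightly below or above one, A cannot change sign, which is the positivity property the logarithmic formulation is meant to provide.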



6.4 Intensity equalization 

The corrections in the constant terms both in the offset equation (6.2) and 
in the gain equation (6.11) are undefined and not regulated during the 
adaptation. This problem can be circumvented by properly normalizing the 
acquired image. One particularly attractive approach to normalization is to 
equalize the image intensity histogram, which in addition to mapping the 
intensity range to unity also produces a maximum entropy coded output [8]. 
Incidentally, the same stochastic algorithms of equations (6.8) and (6.13) for 
non-uniformity correction can also be used for histogram-equalized image 
coding. Pixel intensities are mean-rate encoded in a single-bit oversampled 
representation akin to delta-sigma modulation [9], although without the need 
for integration or any other processing at the pixel level. This could be 
compared with a popular scheme for neuromorphic multi-chip systems, the 
address-event communication protocol [10], in which sparse pixel-based 
events such as spiking action potentials are communicated asynchronously 
across chips. In the technique described here, addresses are not event-based, 
but are supplied synchronously with prescribed random spatial statistics. 

In particular, the image is coded in terms of the bits obtained by 
comparing $z_{r(k)}$ and $z_{r(k-1)}$ as in equation (6.8) or $z'_{r(k)}$ and $z'_{r(k-1)}$ as in 
equation (6.13). If larger, a ‘1’ symbol is transmitted; otherwise, it is a ‘0’. 
The selected address is either part of the transmitted code or it is generated at 
the receiver end from the same random seed. Thus, the code is defined as 

$$f\left( z_{r(k)} \right) = \mathrm{sign}\left( z_{r(k)} - z_{r(k-1)} \right). \tag{6.14}$$

Figure 6-2. Input intensity probability density function (top), and corresponding mean-rate 
transfer function (bottom) for intensity equalization and normalization. © 2001 IEEE. 

By selecting random addresses with controlled spatial statistics as in 
equation (6.4), this code effectively compares the intensity of a selected 
pixel with a base value that is either a global or a local average. The 
probability of ‘1’ is the fraction of pixels in that neighborhood with intensity 
lower than the present pixel. This is illustrated in Figure 6-2, in which the 
mean-rate pixel activity is given by the cumulative probability density 
function 

$$\Pr\left( f\left( z_{r(k)} \right) = 1 \right) = \int_{-\infty}^{z_{r(k)}} p(z) \, dz. \tag{6.15}$$



This corresponds to intensity equalization and normalization of the 
image, a desirable feature for maintaining a large dynamic range in image 
acquisition [11]. As seen in Figure 6-2, the coding transfer function assigns 
higher sensitivity to statistically more frequent intensities. The uniform 
distribution and maximum entropy encoding obtained by this transformation 
is a well-known result and appears to take place in biological 
phototransduction as well [8]. The mechanism of image equalization as 
achieved here is unique in that it evolves from statistical techniques in an 
oversampled representation, and the statistics of the address sequence can be 
tailored to control the size of the neighborhood for different spatial scales of 
intensity normalization. 

Figure 6-3. Circuit diagram of the floating-gate adaptive pixel. © 2001 IEEE. 



6.5 Focal plane VLSI implementation 

Rather than implementing equation (6.13) directly, the exponential 
relationship between voltage and current in a (subthreshold) MOS transistor 
is used to encode a current gain as the exponential of a differential voltage 
across a floating-gate capacitor. The increments and decrements $\pm\Delta q$ in 
equation (6.13) are then naturally implemented by hot electron injection and 
tunneling across the floating-gate oxide [12]. The voltage on the floating 
gate is then a function of the charge, 

$$V_{fg} = \lambda \left( V_{EL} + \frac{Q}{C_{EL}} \right) + (1 - \lambda) V_{in}, \tag{6.16}$$

where $Q$ is the charge injected or tunneled onto or from the floating gate, 
$\lambda = C_{EL}/(C_{EL} + C_{in}) \approx 0.3$, and $V_{EL}$ is an externally applied global voltage for 
all pixels. The schematic of the floating-gate pixel is shown in Figure 6-3. A 
vertical pnp bipolar transistor converts photon energy to emitter current $I_{in}$ 
with current gain $\beta$. Transistors M1 and M2 form a floating-gate current 
mirror with adjustable gain [13]. The output current $I_{out}$ of the pixel is 
sourced by transistor M2 and measured off-chip. The gate and source of 
transistor M3 provide random access pixel addressing at the periphery as 
needed to implement the stochastic kernel. For this pixel design, equation 
(6.16) establishes the following current transfer function in the subthreshold 
regime: 

$$I_{out} = c \, (I_{in})^{1-\lambda} \exp\left( -\frac{\kappa \lambda Q}{C_{EL} V_T} \right) \exp\left( -\frac{\kappa \lambda V_{EL}}{V_T} \right), \tag{6.17}$$

where $c = \left( I_0 (W/L) \exp(V_{dd}/V_T) \right)^{\lambda}$, $I_0$ is the subthreshold leakage current, $W$ 
and $L$ are the width and length of transistors M1 and M2, $V_{dd}$ is the supply 
voltage, $V_T$ is the thermal voltage, and $\kappa$ is the subthreshold slope factor. 

Figure 6-4. Pictorial interpretation of the contributions of $\lambda$, $V_{EL}$ and $Q/C_{EL}$ to the pixel 
current transfer function (a) for subthreshold output current and (b) for above-threshold 
output current. © 2001 IEEE. 

Figure 6-5. Layout of a 2 x 4 pixel sub-array. 

The first exponential factor on the right in equation (6.17) corresponds to 
the adaptive gain correction A, while the second exponential factor 
represents normalization, which is globally controlled by $V_{EL}$. By injecting 
electrons onto (or tunneling electrons from) the floating gate [12], $Q$ is 
incrementally (or decrementally) altered, which in turn logarithmically 
modulates $A$ and thereby effectively implements the pilot rule of equation 
(6.12). 
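
The way a fixed charge packet translates into a fixed relative gain step can be sketched numerically from equations (6.16) and (6.17). Everything numeric below is an assumption made for illustration only: the slope factor, the capacitance, the charge packet size and the function names are hypothetical, and only the coupling ratio of about 0.3 comes from the text.

```python
import math

LAMBDA = 0.3       # coupling ratio C_EL / (C_EL + C_in), approximate value from the text
KAPPA = 0.7        # subthreshold slope factor (assumed typical value)
V_T = 0.0258       # thermal voltage near room temperature, in volts
C_EL = 10e-15      # electrode coupling capacitance in farads (hypothetical)


def floating_gate_voltage(v_el, q, v_in):
    """Floating-gate voltage following the form of equation (6.16)."""
    return LAMBDA * (v_el + q / C_EL) + (1.0 - LAMBDA) * v_in


def output_current(i_in, q, v_el, c=1.0):
    """Subthreshold transfer function following the form of equation (6.17).

    c is left as a free scale constant here.
    """
    gain_term = math.exp(-KAPPA * LAMBDA * q / (C_EL * V_T))      # adaptive gain A
    norm_term = math.exp(-KAPPA * LAMBDA * v_el / V_T)            # global normalization
    return c * i_in ** (1.0 - LAMBDA) * gain_term * norm_term


def injection_gain_step(i_in, q, v_el, dq):
    """Ratio of output currents after and before adding a charge packet dq."""
    return output_current(i_in, q + dq, v_el) / output_current(i_in, q, v_el)


if __name__ == "__main__":
    dq = -1e-17    # small packet of negative charge, in coulombs (hypothetical)
    print(injection_gain_step(1e-9, 0.0, 1.0, dq))
    print(injection_gain_step(5e-8, -3e-16, 2.0, dq))
```

In this model the two printed ratios coincide: the same packet multiplies the output current by the same factor regardless of the input current, the stored charge or the electrode voltage, which corresponds to constant increments on a logarithmic scale.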

Figure 6-4 illustrates the effect of the various contributions to the pixel 
current transfer function $I_{out}$ through the floating-gate voltage $V_{fg}$ as given by 
equation (6.17). Capacitive division between $C_{in}$ and $C_{EL}$ reduces the voltage 
swing on the floating gate $V_{fg}$ by a factor $(1-\lambda)$ relative to the input voltage 
$V_{in}$. Through the logarithmic V-to-I transformation across the MOS transistor 
for subthreshold output current, this factor compresses the dynamic range of 
intensities in the output image, 

$$I_{out} \propto (I_{in})^{\varepsilon}, \tag{6.18}$$

by a factor $\varepsilon = (1 - \lambda)$, as shown in Figure 6-4(a). Hot electron injection 
onto the floating gate modulates the charge $Q$, and thereby corrects the 
(relative) gain in each pixel individually by correspondingly lowering the 
floating-gate voltage $V_{fg}$. The electrode voltage $V_{EL}$ allows for a global shift 
of $V_{fg}$ for all pixels, in either a positive or negative direction as shown in 
Figure 6-4(a) and 6-4(b). The effect either way is a global, electronically 
adjustable scale factor in the gain, which allows for automatic gain control. 
For lower values of $V_{EL}$, which bring the transistor M2 above the threshold as 
indicated in Figure 6-4(b), a smaller compression factor $\varepsilon$ is obtained in the 
current transfer function, although this factor $\varepsilon$ then depends on the signal. If 
the image is subsequently histogram equalized through the oversampled 
binary encoding, the nonlinearity that $\varepsilon$ introduces in the transfer function 
becomes irrelevant. 
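
As a rough worked illustration of the compression in equation (6.18), assume subthreshold operation, the approximate coupling ratio of 0.3 quoted earlier, and a hypothetical 100:1 ratio between the largest and smallest input currents (a figure chosen only for the arithmetic):

```latex
\varepsilon = 1 - \lambda \approx 0.7, \qquad
\frac{I_{out,\max}}{I_{out,\min}}
  = \left( \frac{I_{in,\max}}{I_{in,\min}} \right)^{\varepsilon}
  \approx 100^{0.7} \approx 25 .
```

Under these assumptions, two decades of input intensity would map onto roughly 1.4 decades of output current.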

The layout of a 2 x 4 pixel sub-array is shown in Figure 6-5. Note that 
while the layout minimizes pixel size, it does so at the expense of analog 
transistor matching. The consequences of this can be seen in the “before 
correction” image in Figure 6-8. 

6.6 VLSI system architecture 



An array of 64 x 64 adaptive pixels was integrated with x and y random- 
access addressing decoders onto a 2.2 mm x 2.25 mm chip in 1.5 µm CMOS 
technology. A photomicrograph of the prototype fabricated through MOSIS 
is shown in Figure 6-6. 

Figure 6-7 illustrates the architecture of the chip and the setup used to 
experimentally validate the concept of reducing the gain mismatch between 
pixels on the prototype adaptive array. 

The imager was uniformly illuminated and a single column and row 
address $r(x(k), y(k)) = r(k)$ was randomly selected. With switch S1 closed 
and S2 open, $I_{out}(k)$ was measured using a transimpedance amplifier to 
generate a voltage $z_{r(k)}$. If $f(z_{r(k)}) = 0$, S1 was opened and S2 momentarily 
closed. The drain of transistor M2 was pulsed down to $V_{inj} \approx (V_{dd} - 8\,\mathrm{V})$ and a 
small packet of negative charge was injected onto the floating gate. If 
$f(z_{r(k)}) = 1$, the gain of the selected pixel was not altered and the process was 
continued by randomly selecting a new pixel. 

A one-sided version of the stochastic learning rule of equation (6.13) was 
implemented: 



$$\Delta q_{r(k)} = \begin{cases} \alpha, & f\left( z_{r(k)} \right) < 0 \\ 0, & \text{otherwise.} \end{cases} \tag{6.19}$$

Because adaptation is active in only one direction, the average level $\langle z \rangle$ 
drifts in that direction over time. The coupling electrode voltage $V_{EL}$ can be 
used to compensate for this drift and implement automatic gain control. 
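
The drift caused by one-directional adaptation can be seen in a toy simulation of the one-sided rule of equation (6.19). This is a sketch under assumed conditions (log-normal gain mismatch, uniform illumination, independent random addresses, hypothetical step size and function name), and it omits the compensation through the coupling electrode.

```python
import math
import random
import statistics


def one_sided_adaptation(gains, x=1.0, alpha=0.005, steps=100_000, seed=0):
    """One-sided rule: the log-gain q of the selected pixel is raised by alpha
    only when its output is lower than that of the previously selected pixel."""
    rng = random.Random(seed)
    n = len(gains)
    q = [0.0] * n                                # q = ln A, starts at A = 1
    prev = rng.randrange(n)
    for _ in range(steps):
        cur = rng.randrange(n)
        z_cur = math.exp(q[cur]) * gains[cur] * x
        z_prev = math.exp(q[prev]) * gains[prev] * x
        if z_cur < z_prev:
            q[cur] += alpha                      # gain can only increase
        prev = cur
    return q


if __name__ == "__main__":
    rng = random.Random(2)
    a = [math.exp(rng.gauss(0.0, 0.3)) for _ in range(64)]
    q = one_sided_adaptation(a)
    out = [math.exp(qi) * ai for qi, ai in zip(q, a)]
    print("relative spread before:", statistics.pstdev(a) / statistics.mean(a))
    print("relative spread after :", statistics.pstdev(out) / statistics.mean(out))
    print("mean output before/after:", statistics.mean(a), statistics.mean(out))
```

In this model the relative spread shrinks while the mean output keeps rising for as long as adaptation continues, which is the drift that a global adjustment would have to absorb.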

After gain non-uniformity correction, the imager can be used to acquire 
static natural images. Using random addresses with prescribed statistics, the 
output bit from the comparator $f(z_{r(k)})$ can also be accumulated in bins 
whose addresses are defined by $r(k)$. The resulting histogram then represents 
the intensity-equalized acquired image. 
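
The binning procedure can be sketched in software as follows. This is an illustrative model only: the function name, the flattened one-dimensional "image" and the use of independent uniformly random addresses (global statistics) are assumptions, and the comparator is idealized without hysteresis.

```python
import random


def equalize_by_random_addressing(image, oversampling=256, seed=0):
    """Accumulate comparator bits in per-address bins and return mean rates.

    image : flat list of pixel intensities z
    Each step selects a random address r(k), compares z at r(k) with z at
    r(k-1), and adds the resulting bit to the bin of address r(k).
    """
    rng = random.Random(seed)
    n = len(image)
    ones = [0] * n
    visits = [0] * n
    prev = rng.randrange(n)
    for _ in range(oversampling * n):
        cur = rng.randrange(n)
        visits[cur] += 1
        if image[cur] > image[prev]:             # bit '1' if larger, else '0'
            ones[cur] += 1
        prev = cur
    return [o / v if v else 0.0 for o, v in zip(ones, visits)]


if __name__ == "__main__":
    skewed = [i ** 3 for i in range(100)]        # strongly non-uniform intensities
    rates = equalize_by_random_addressing(skewed)
    print([round(r, 2) for r in rates[::10]])
```

For the skewed test image, the mean rates come out roughly evenly spaced between 0 and 1 even though the intensities themselves are not, since each rate estimates the fraction of pixels darker than the selected one.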

6.7 Experimental results 

The 64 x 64 phototransistor-based imager was uniformly illuminated 
using a white light source. The pixel array was scanned before any gain 
mismatch correction and again after every 200 cycles of correction, until the 
correction was judged to be completed after 2800 cycles. Each of the 4096 
pixels was selected in random sequence every cycle. Figure 6-8 shows the 
evolution of the histograms built from the $I_{out}$ recorded from each pixel on 
the focal plane versus adaptation cycle number. Also shown are the scanned 
images from the chip before and after gain mismatch correction. 

Figure 6-6. Photomicrograph of the 64 x 64 pixel adaptive imager chip. Dimensions are 
2.2 mm x 2.25 mm in 1.5 µm CMOS technology. © 2001 IEEE. 

The standard deviation of $I_{out}$ (i.e., $\sigma_{I_{out}}$) normalized to the mean $\langle I_{out} \rangle$ 
was measured and plotted versus $\langle I_{out} \rangle$ before and after gain mismatch 
correction. Figure 6-9 plots these experimental results. The five different 
$\langle I_{out} \rangle$ values correspond to five different levels of illumination, which have 
been labeled 1, 2, 3, 4, and 5. Adaptation was done at the illumination level 
corresponding to label 5. 

A black and white 35-mm slide was projected onto the imager after gain 
mismatch correction and the array was scanned. The slide contained a light 
grey character “R” against a dark grey background (both bitmapped). The 
resulting image, as scanned from the imager chip, is shown in Figure 6-10. 

A 35-mm grayscale (pixelated 64 x 64) image of an eye, shown in Figure 
6-11(a), was also projected onto the imager. The acquired image is shown in 
Figure 6-11(b), and the histogram-equalized image (which was obtained 
from a 256-times-oversampled binary coding of the chip output) is shown in 
Figure 6-11(c). 

Figure 6-7. Chip architecture and system setup for gain mismatch correction and intensity 
histogram equalization. © 2001 IEEE. 



6.8 Discussion 

Injecting a negative packet of charge onto the floating gate of transistor 
M2 lowers its gate voltage and therefore increases its output current. 
Consequently, correction is in one direction only, increasing the current 
gain. Since the efficiency of charge injection depends exponentially on the 
magnitude of drain-source current through the device [12], pixels having 
higher $I_{out}$ will inject more each time their drains are pulled down to $V_{inj}$. 
This positive feedback mechanism can be kept in check either by driving the 
source of the floating-gate p-FET transistor with a current source (as shown 
in Figure 6-12), or by setting $V_{inj}$ appropriately, keeping S2 closed for a fixed 
time interval (≈ 20 µs), and having hysteresis in the comparator that 
computes $f(z_{r(k)})$. The latter option was chosen here to provide simplicity in 
the test setup. 

Figure 6-8. The time course of gain non-uniformity reduction as recorded from the adaptive 
imager chip. Also shown are images acquired before and after gain correction was performed 
under conditions of uniform illumination. © 2001 IEEE. 

Figure 6-12 shows three different schemes for performing hot electron 
injection onto the floating gate of a p-FET transistor. The first depicts the 
method used in the chip presented here, which as explained above can lead 
to unstable behavior unless the necessary precautions are taken. The second 
method uses a current source to set the current ($I_{set}$) that the p-FET transistor 
must source. The compliance of the current source must be such that it can 
sink $I_{set}$ at the appropriate drain voltage ($V_{inj}$) of the floating-gate p-FET 
transistor. The current $I_{out}$ will approach $I_{set}$ asymptotically and $V_{inj}$ will rise 
appropriately, so as to decrease injection. The third method is the one chosen 
for implementation in the latest version of this chip. Each time the drain of 
the floating-gate p-FET transistor is pulsed down to $V_{inj}$, its output current 
$I_{out}$ will increase (almost) linearly, and the rate at which it increases can be 
set by $I_{inj}$ and the pulse width. $I_{inj}$ is common for all pixels in the array. 

Figure 6-9. Experimental $\sigma_{I_{out}} / \langle I_{out} \rangle$ versus $\langle I_{out} \rangle$ for five different illumination intensities 
before gain correction (top curve) and after gain correction (bottom curve). © 2001 IEEE. 




Figure 6-10. Example image acquired from the adaptive imager chip after gain mismatch 
reduction. A light-grey letter “R” against a dark-grey background was projected onto the chip. 
© 2001 IEEE. 

Figure 6-11. (a) Original image, (b) image acquired from the chip, and (c) equalized image 
obtained from oversampled binary coding (binning) of the chip outputs. © 2001 IEEE. 

The “before correction” scanned image in Figure 6-8 shows strong 
vertical striations in $I_{out}$. After the gain mismatch correction procedure, these 
striations are no longer visible. However, five dark pixels (low $I_{out}$) can be 
seen in this image. These pixels are “stuck” off and therefore experience 
negligible injection when they are selected. In the new version of the 
floating-gate imager, the current source $I_{inj}$ prevents any pixels from staying 
in this “stuck” state. Ideally, an impulse would be expected in the histogram 
after correction, with all pixels having the same $I_{out}$ when uniformly 
illuminated. In reality, a single narrow peak is seen in the histogram due to 
the injection efficiency being proportional to current and due to hysteresis in 
the comparator. 

Figure 6-9 demonstrates that the gain mismatch itself was reduced, and not 
merely $\sigma_{I_{out}} / \langle I_{out} \rangle$ as a consequence of increasing $\langle I_{out} \rangle$ [14]. The pre- and post- 
correction data lie on two separate curves, demonstrating that there is indeed 
a dramatic reduction in gain mismatch due to adaptation. At low $\langle I_{out} \rangle$ (i.e., 
low illumination), there is a reduction in $\sigma_{I_{out}} / \langle I_{out} \rangle$ from 70% to 10%. At 
higher $\langle I_{out} \rangle$ (i.e., high illumination), the reduction is from 24% to 4%. 

The scanned image of an “R” after adaptation shown in Figure 6-10 
gives a clear image mostly free of gradients and other fixed pattern noise 
present in the imager before compensation. The remaining “salt and pepper” 
noise (two pixels in Figure 6-10) is an artifact of the inhomogeneous 
adaptation rates under voltage-controlled hot electron injection in the setup 
of Figure 6-7, which can be alleviated by using the current-controlled setup 
of Figure 6-12. The “eye” image after intensity equalization in Figure 6-11(c) 
reveals more (intensity) detail, especially around the iris, than the 
acquired image in Figure 6-11(b). 

Figure 6-12. Three schemes for injecting charge onto the floating gate of the p-FET transistor 
in this pixel. 



6.9 Conclusions 

A compact pixel design and a strategy for reducing the gain mismatch 
inherent in arrays of phototransistors used in CMOS imagers have been 
introduced. It has been shown how the learning rule for offset correction can 
be transformed into the logarithmic domain to produce a stable learning rule 
for on-line gain mismatch correction. This rule is very naturally 
implemented by a simple translinear circuit. The pixel incorporates a 
floating-gate transistor that can be incrementally injected with a small packet 
of negative charge. The injected charge increases the current gain of the 
pixel in relative terms (i.e., by constant increments on a logarithmic scale). 

Experimental results from a custom 64 x 64 phototransistor-based 
adaptive-pixel CMOS array (fabricated through MOSIS) prove that this 
pixel design and learning rule were successful for SNUC. In addition, 
intensity histogram equalization and digital coding of the output image were 
demonstrated in a binary oversampled representation, by means of the same 
random-addressing stochastic algorithms and architecture that were used for 
the adaptation. 

MCF 

image, modulation (MCI). See MCI 
object, modulation (MCO). See MCO 
power, 41, 44, 47, 48 
Control word, 147 
Controlled spatial statistics, 209 
Control-timing unit, 161, 163 
Conversion gain (Cg), 80, 81, 86, 90, 96, 
108, 113, 123, 124, 135, \19.See also 
Charge-to-voltage conversion gain 
Convolution, 67, 157, 159, 160, 199 
kernel, 142, 143, 149-152, 158 
kernels, spatial, 206 

COR (Computation-on-readout), 142, 199 
Correction, adaptive non-uniformity. See 
Adaptive non-uniformity correction 
Correlated double sampling (CDS). See 
CDS 

Coupling electrode, 213 



Current 

gain, 8, 24, 116, 204, 205, 210, 215, 
220 

gain mismatch, 204, 205 
source, 17, 41, 42, 129, 179, 183, 188, 
192, 215, 216, 218 
Current-domain 
circuits, 207 

image processing, 143-160. See also 
GIP 

Current-mode, 142, 199 
image processing, 143 
pixel, 41, 44, 45, 48, 49 

information rate. See Information 
Cutoff wavelength, 11, 18 
Dark current, 19, 24, 29, 35, 36, 41, 46, 
88, 107, 110, 123, 124, 153, 179, 184 
DCT (discrete cosine transform), 160 
DDS (difference double sampling), 145, 
156, 161-163, 169 

Density-of-states effective mass, 6, 7 
Dependence 
quadratic, 154 

Depletion region, 16, 18, 19, 21, 22, 27- 
29, 35, 59, 63, 78, 83, 89, 93 
Detectivity, 11 
specific, 11 

Diagonal edges, 151, 160 
Difference double sampling (DDS), 145, 
156, 161-163, 169 
Diffusion 

length, 20, 23, 29, 66, 69, 72, 78, 83, 
89, 93 

photocarrier, 63, 65, 67, 72 
thermal, 35 

Digital processors, 152 
Digitally controlled analog processors, 
143, 147 
Direct 

bandgap material, 5, 10, 11 
transition, 8-10 

Discrete cosine transform (DCT). See 
DCT 
Drift 

fringing field, 35 
self-induced, 35 
Domain. See also Mode 

current. See Current-domain 
mixed, 170-198 
voltage. See Voltage-domain 
Dynamic Range (DR), 31, 36, 39, 88, 
102, 106, 107, 110, 111, 154, 182, 
194, 204, 209 

approach. See also Autoscaling APS 
architecture of, 117-123 
experimental results, 123-124 
widening, 114-117 
Edge 

circuitry, 170, 172, 176-178, 185, 

190, 198 

detection, 102, 150-152 
diagonal, 151, 160 
horizontal, 150 
vertical, 150, 151, 159 
Effective 

density of states, 6 
mass, 5, 6, 7 
Efficiency 

charge collection, 33, 34 
charge transfer, 33-35, 37 
injection, 218 
quantum (QE). See QE 
shutter, 113 
Electron 

as carrier, fundamentals, 2-8 
concentration, 7 
mobility (μ_n), 13 

Electron-hole pairs, optical generation of, 
8 

Electronic shutter, 106, 145, 170 
Emitter current, 24-26, 210 
Energy band structure, 2-5, 27 
Equation 

continuity, 14, 59 
Schrödinger, 3 

Exponent (EXP), 117-119, 122, 124 
External control over integration time, 
sensors with, 115, 116 
Extrinsic semiconductor, 7 
Face-centered cubic (fcc), 3, 4 
False alarm, 124, 126, 127, 129, 133, 135 
Fat zero, 36 

Fermi energy (E_F), 6, 27 
FF (Fill factor), 54, 60, 62, 69, 75, 80, 83, 
85, 90, 96, 100, 105, 107, 108, 110, 
113, 114, 116-118, 122, 124, 125, 
132, 170, 172, 194, 204 
comparison for APS and CCD, 59 



definition, 39 

Field of view (FOV). See FOV 
Fill factor (FF). See FF 
Filtering 

background, 126, 127, 129 
temporal, 199 
Filters 

adaptive spatial, 128-135 
non-separable, 143, 152 
Fixed pattern noise (FPN). See Noise, 
fixed pattern (FPN) 

Fixed-bias mode, 144, 145, 152, 153, 155 

Flicker noise, 10, 204 

Floating-gate 

adaptation, neuromorphic, 203 
current mirror, 210 
oxide, 210 

pixel, 204, 205, 210, 212, 213, 215, 
216, 218, 219 

Floating-point numbers, 117, 118 
Focal plane, 57, 141, 142, 159, 170, 199, 
203-205 

analog image processing, 143 
VLSI implementation of non- 
uniformity correction, 210-215 
FOV (Field of view), 124 
FPN (fixed pattern noise). See Noise, 
fixed pattern (FPN) 

Frame transfer imager, 33, 34 
Frequency response, 55, 194 
Frequency-based sensors, 115, 116 
Fringing field drift, 35 
Function 
Bloch, 3 

logarithmic transfer, 173 
modulation contrast (MCF). See MCF 
optical transfer (OTF). See OTF 
phase transform (PTF). See PTF 
point spread (PSF). See PSF 
G. See Gain: definition of, (G) 

Gain (G) 

adaptive pixel current, 205 
charge-to-voltage conversion, 78 
control, automatic, 212, 213 
conversion (Cg), 80, 81, 86, 90, 96, 
108, 113, 123, 124, 135, \19. See 
also Gain, charge-to-voltage 
conversion 
Gain (continued) 

adaptive correction, 212 
correction, 166, 203-207, 212-219 
current, 8, 24, 116, 204, 205, 210, 
215, 220 
definition of, 14 
mismatch of current, 204, 205 
non-uniformity, 204, 207, 213, 216 
Gaussian 

channel, 40, 50 
image, 151 

General image processor (GIP). See GIP 
Generation-recombination noise, 15, 16 
GIP (general image processor), 143-160 
algorithm, 148-149 
further applications, 159-160 
hardware implementation, 143-148 
overview, 143 
scalability, 157-159 
test results, 150-157 
Global 
shift, 212 
shutter, 112, 113 
Gradient 

descent, 205, 207 
operators, 152 

Histogram equalization, 215, 220 
Hole 

as carrier, fundamentals, 5-8 
concentration, 6, 7, 14, 22, 23 
mobility (μ_p), 13 
Horizontal edge, 150 
Hot electron injection, 210, 212, 216, 218 
Hysteresis, 215, 218 
Image 

delay, 143 
Gaussian, 151 
intensity equalization and 
normalization, 204 
modulation contrast (MCI). See MCI 
normalizing the acquired, 208 
processing 

current-domain, 143-160 
focal-plane analog, 143 
mixed-domain, 170-198 
spatiotemporal, 142 
voltage-domain, 161-169 
processor, general (GIP). See GIP 
sensor. See Imagers 



smear, 146 
Imagers 

centroid-tracking. See Centroid- 
tracking imager 
CCD 

comparison with APS, 37-39, 99-102 
fundamentals, 31-38 
CMOS, 54, 61, 89, 101, 102-117, 

136, 158, 205, 219. See also APS; 
PPS 

frame transfer, 33, 34 
general image processor. See GIP 
temporal difference. See TDI 
time difference. See TDI 
time delay. See TDI 
Indirect 

bandgap material, 5, 10, 11, 49 
transition, 8, 9 
Inertia, 207 
Information 

capacity, 40, 50 
fundamentals, 39-42 
mutual, 39, 40 
rate, 39-50 

for charge-mode pixels, 42-43 
comparison for different modes, 
46-49 
for current-mode pixels, 44-45 
for voltage-mode pixels, 45-49 
Injection efficiency, 218 
Integration 

capacitance, 79 
cycles, 145 
photocarriers, 86, 96 
time (T_int), 29, 38, 43, 46, 47, 79, 103, 
107, 113, 120, 121, 127, 161, 164- 
166, 170, 179, 183, 191 
autonomous control over, 115, 117 
external control over, 115, 116 
Integrative mode, 144, 145, 155, 156 
Intensities, statistically more frequent, 
209 

Intensity equalization, 208-209 
Interface traps, 35, 36 
Intrinsic 

carrier concentration, 7 
semiconductor, 6, 7, 21 
Kernel, 67, 68, 72, 147 
coefficients, 152 

convolution, 142, 143, 149-152, 158 
locally differenced diffusive, 206 
spatial convolution, 206 
spatial derivative, 160 
spatiotemporal, 156, 159 
stochastic, 210 
Laplacian, 149-152 
operators, 152 

Lateral collecting surface, 82, 84 
Lattice constant, 8 
Layout-dependent parameter, 76 
Leakage, 18, 87, 113, 184, 190, 195, 212 
currents, 88, 146, 163-169, 179 
Lifetime of carrier (τ), 12, 14 
Linearity, 116, 179, 185, 191, 199 
Locally differenced diffusive kernel, 206 
Logarithmic 
mode, 154 

photodiode, 111-112, 115 
transfer function, 173 
transformation, 205, 207 
Majority carrier, 7 
Mantissa (Man), 117, 119, 122, 124 
Mass-action law, 7 
Matching, 127, 156, 170, 180, 212 
Maximum entropy coded output, 208 
MCF (modulation contrast function), 57 
MCI (modulation contrast image), 56, 57 
MCO (modulation contrast object), 56 
Memory, embedding, 207 
Minority carrier, 7, 29, 35, 59, 60, 65, 84, 
195 

Mobility (μ), 5, 35, 88, 89, 93 
Mode. See also Domain 
current, 142, 199 

fixed bias, 144, 145, 152, 153, 155 
integrative, 144, 145, 155, 156 
logarithmic, 154 
mixed, 170-198 

snap shot and evaluate, 165, 168 
strong inversion, 154, 155 
Modulation 

contrast function (MCF). See MCF 
contrast object (MCO). See MCO 
transfer function (MTF). See MTF 
Monotonicity, 208 
Motion detection, 102, 199 



MTF (modulation transfer function), 73 
fundamentals, 53-58 
experimental measurement, 62-67 
modeling, CMOS APS, 59-62, 67-72 
Multi-mode sensors, 115, 116 
Mutual information, 39, 40 
Navigation, autonomous, 159 
NEP (noise equivalent power). See Noise, 
NEP 

Neuromorphic 
design, 142 

floating-gate adaptation, 203 
Nodes, 

sense, 107, 110 
summing, 159 
Noise 

cancellation circuits, 157 
fixed pattern (FPN), 36, 38, 105, 107, 
110, 112, 116, 123, 124, 135, 152, 
155, 161, 163, 169, 181, 193-195, 
199, 204, 218 

column-to-column, 39, 192, 

193 

pixel-to-pixel, 192 
flicker, 10, 204 

generation-recombination, 15, 16 
NEP (noise equivalent power), 10, 16, 
20, 26, 49 
pixel-to-pixel, 192 
pixel-to-pixel FPN, 192 
shot, 10, 19, 24, 25, 29, 36, 43-45, 48, 
88, 110, 129, 179, 180 
SNR (signal-to-noise ratio), 10, 15, 
16, 20, 25, 26, 30, 46, 54, 59, 72, 
80, 106, 108, 110, 114, 156, 157, 
162, 182, 183, 194 
thermal, 10, 15, 16, 19, 30, 48, 180, 
187 

Noise equivalent power (NEP). See 
Noise, NEP 
Nonlinearity, 212 
Non-separable filters, 143, 152 
Non-uniformity 

correction, 204, 205, 208 

adaptive. See Adaptive non- 
uniformity correction 
scene-based (SBNUC), 205-207 
static (SNUC), 205-207, 220 
Non-uniformity (continued) 
gain, 204, 207, 213, 216 
cancelling, 207-208 
Normalizing the acquired image, 208 
Offset correction, 204-207, 219 
On-line correction, 204 
Optical 

generation of electron-hole pairs, 8 
transfer function (OTF). See OTF 
Orientation selectivity, 152 
OTF (optical transfer function), 54, 55 
Overhead, 152, 204 
Oversampled binary encoding, 212 
Partial event, 34 

Passive pixel sensors (PPS). See PPS 
Phase transform function (PTF). See PTF 
Photocarrier. See also Carrier 
diffusion, 63, 65, 67, 72 
integration, 86, 96 
Photocharge 
collected, 79 
Photoconductive, 15, 17 
Photoconductors, 11-16, 18 
Photocurrent, 17-19, 21, 24, 25, 29, 41- 
50, 76, 78, 93, 111, 112, 143-146, 
153, 154, 173, 184, 185, 187-190 
primary, 13, 14 
Photodetector 

fundamentals of, 8-31 
photoconductors, 11-16 
photodiodes, 16-24 
photogates, 26-31 
phototransistors, 24-26 
PIN diode, 21-22 
Photodiode 

fundamentals, 16-24, 106-109 
junction capacitance, 78 
logarithmic, 111-112, 115 
model for photoresponse, 81-85 
PIN, 8, 21 
pinned (p+-n-p), 106, 110-111 
snapshot, 112-113 
Photogate, 26-31, 37, 109-111 
Photomicrograph, 213, 214 
Photon energy, 29, 210 
Photoresponse, 54, 75, 88 
comparison of model with 
experiments, 85-87 
fundamentals, 76-80 



model for photodiode pixels, 81-85 
model predictions for scalable CMOS 
technologies, 90-96 
Photosensor, 58, 116, 173 
Phototransduction 
biological, 209 
stage, 107 

Phototransistor, 205, 213, 220 
fundamentals of, 24-26 
Photovoltage, 17, 108, 162-165 
Photovoltaic, 17, 18 
Pilot rule, 212 
PIN photodiode, 8, 21-22 
Pinned photodiode (p+-n-p), 106, 110-111 

Pipeline 

readout technique, 161, 163, 164, 168 
timing, 163 
Pixel 

active. See APS 

addressing at the periphery, random 
access, 210 

APS, 37, 40, 50, 88, 90, 96, 170, 171, 
178, 179, 184, 194. See also APS 
fundamental types, 105-113 
logarithmic photodiode, 111-112 

photodiode, 106-109 
photogate, 109-110 
pinned photodiode, 110-111 
snapshot, 112-113 
area, 39, 59, 86, 113, 157, 161 
bad, 124, 126, 129, 130, 131 
centroid-mapping, 193 
centroid-tracking, 174, 185, 187 
charge-mode, 41-44, 46-49 
current-mode, 41, 44, 45, 48, 49 
floating-gate, 204, 205, 210-219 
grouping, 149 

information rate. See Information 
intensity, statistics of, 204 
passive. See PPS 

photodiode, model for photoresponse, 
81-85 

snapshot, 112, 113 
voltage-mode, 41, 44-46, 48, 49 
Pixel-level ADC, 114 
Pixel-parallel processing, 143, 199 
Pixel-serial processing, 199 
Pixel-to-pixel 
FPN, 192 
noise, 192 
variations, 107 

Point spread function (PSF). See PSF 
Positive feedback, 215 
Positivity constraint, 207 
Potential, surface (ψ_s), 27-29, 35 
PPS (passive pixel sensors), 102-105 
Primary photocurrent, 13, 14 
Process parameters, 154 
Process-dependent parameters, 76 
Processing. See also Image processing 
block-parallel, 199 
pixel-parallel, 143, 199 
pixel-serial, 199 
sequential, 199 
spatial, 145, 150, 151 
spatiotemporal image, 142 
temporal, 142, 145, 159 
PSF (point spread function), 57-58, 62-72 

PTF (phase transform function), 55 
QE (quantum efficiency), 10, 17, 20, 21, 
23, 28, 29, 33, 49, 75, 79, 85, 89, 96, 
100, 105, 110, 111, 124, 179 
Quadratic dependence, 154 
Quantum efficiency (QE). See QE 
Random access pixel addressing at the 
periphery, 210 
Readout 

circuit for, 34, 41, 42, 77, 94, 103, 

106, 162, 179, 185, 194 
computation on (COR). See COR 
pipeline, 161, 163, 164, 168 
rate, 100, 105, 116, 120, 180, 184 
rolling, 108, 109, 112, 113, 171 
snap-shot-and-evaluate technique, 168 
technique, 105, 112, 161, 166 

Reset 

column, 121, 123 

row (RRST), 121, 123, 170 

stage, 42, 107 

transistor (Reset), 42, 43, 104, 106, 
107, 123, 161, 170, 182 
Resolution 

spatial, 33, 53, 54, 114, 117, 119, 127, 
131, 204 

temporal, 117, 119 



Response, photo-. See Photoresponse 
Responsivity, 10, 24, 49, 62, 123 
Retinomorphic systems, 142 
Rolling 

readout technique, 108-109, 112-113, 
171 

shutter. See Rolling readout technique 
Row 

reset (RRST). See RRST 
select transistor, 106, 108, 122 
RRST (row reset), 121, 123, 170 
S/H (sample-and-hold), 107, 113 
Sample-and-hold (S/H). See S/H 
Sampling 

correlated double (CDS), 38, 42, 107, 
108-110, 127, 128, 131, 145, 157, 
171, 172, 179-185, 192, 193, 199 
difference double (DDS), 145, 156, 
161-163, 169 

Saturation level, 115, 123, 124, 169 
SBNUC (scene-based non-uniformity 
correction), 205, 206, 207 
Scalability, 142, 157, 158 
Scalable CMOS technologies. See CMOS 
Scaling 

auto-, 114-125 
factor, 117, 118 
technology, 78, 88, 89 

impact on pixel sensitivity, 88-90 
Scanning registers, 143, 146-148, 160, 
161 

SCCD (surface channel CCD), 36 
Scene-based non-uniformity correction 
(SBNUC). See SBNUC 
Schrödinger equation, 3 
Self-induced drift, 35 
Semiconductor image sensors. See Sensor 
Sense node, 107, 110 
Sensitivity, 8, 17, 31, 72, 88-90, 92, 100, 
103, 158, 179, 185, 186, 191, 204, 209 
Sensor 

active pixel (APS). See APS 
with autonomous control over 
integration time, 115, 117 
autoscaling, 114-125 
CCD 

comparison with APS, 37-39 
fundamentals, 31-38 
companding, 115 
Sensor (continued) 

with external control over integration 
time, 115, 116 
frequency-based, 115, 116 
multi-mode, 115, 116 
passive pixel (PPS). See PPS 
semiconductor image, fundamentals 
of, 31-39 

smart tracking, 114, 124-135 
Sequential processing, 199 
SF (source follower), 41, 42, 77, 105- 
107, 110, 113, 161, 162, 183, 192 
Shot noise. See Noise, shot 
Shutter efficiency, 113 
Sidewall capacitance, 78, 82, 84 
Signal chain, 117 

Signal-to-noise ratio (SNR). See Noise, 
SNR 

Smart tracking sensor, 114, 124-135 
Snap shot 
and evaluate 
mode, 165, 168 
readout technique, 168 
pixels, 112-113 

SNR (signal-to-noise ratio). See Noise, 
SNR 

SNUC (static non-uniformity correction), 
205, 206, 207, 220 
Source follower (SF). See SF 
Spatial 

convolution kernels, 206 
derivative kernels, 160 
filter, adaptive, 128-135 
processing, 145, 150, 151 
resolution, 33, 53, 54, 114, 117, 119, 
127, 131, 204 
Spatiotemporal 

image processing, 142 
kernels, 156, 159 
Specific detectivity, 11 
Spiking action potentials, 208 
Split event, 34 

Static non-uniformity correction (SNUC). 
See SNUC 

Statistically more frequent intensities, 

209 

Statistics of pixel intensity, 204 
Stochastic 

adaptive algorithms, 204 



kernel, 210 

Strong inversion mode, 154, 155 
Subthreshold region or regime, 111, 115, 
173, 210 

Summing nodes, 159 
Surface 

channel CCD (SCCD). See SCCD 
potential (ψ_s), 27, 28, 29, 35 
lateral collecting, 82, 84 
Switched capacitor circuit, 171 
System-on-a-chip, 102, 113-136 
System-on-chip, 142 
TDI (temporal difference imager), 161-169 

hardware implementation, 161-163 
overview, 161 

pipeline readout technique, 163-165 
snap shot and evaluate mode, 165-166 
test results, 166-169 
Technology scaling, 78, 88, 89 

impact on pixel sensitivity, 88-90 
Temporal 

difference, 161-168, 199 
difference imager (TDI). See TDI 
filtering, 199 
processing, 142, 145, 159 
resolution, 117, 119 
Thermal 

diffusion, 35 

noise. See Noise, thermal 
Three-phase clock, 32 
Threshold, 43, 107, 116, 117, 120, 123, 
126-128, 153, 154, 173, 177, 181, 
186, 188, 211, 212 

Time delay imaging. See TDI (temporal 
difference imager) 

Time difference imaging. See TDI 
(temporal difference imager) 

Tracking sensor, smart, 114, 124, 126, 
128-131, 135 

Transimpedance amplifier, 213 
Transistor 

photo-. See Phototransistor 
reset, 42, 43, 104, 106, 107, 123, 161, 
170, 182 

row select, 106, 108, 122 
vertical pnp bipolar, 210 
Translinear floating-gate MOS circuits, 
204 
Tunneling, 88, 210, 212 
Valence band, 5-8 
Vertical 

edge, 150, 151, 159 
pnp bipolar transistor, 210 
VG (virtual ground) circuit, 146-147, 
158-159 

Virtual ground (VG). See VG 
Voltage-domain image processing, 161- 
169 



Voltage-mode pixel, 41, 44-46, 48, 49 
information rate. See Information 
Voltage swing, 79, 115, 161, 162, 168, 
186, 212 

Winner-take-all (WTA). See WTA 
WTA (winner-take-all), 102, 114, 124- 
127, 130-132, 135 
X-addressing circuitry, 108 




